Error exponents characterize the exponential decay, when increasing message length, of the probability
of error of many error-correcting codes. To tackle the long-standing problem of computing
them exactly, we introduce a general, thermodynamic, formalism that we illustrate with maximum-likelihood
decoding of low-density parity-check (LDPC) codes on the binary erasure channel (BEC)
and the binary symmetric channel (BSC). In this formalism, we apply the cavity method for large 
deviations to derive expressions for both the average and typical error exponents, which differ by the 
procedure used to select the codes from specified ensembles. When decreasing the noise intensity, we 
find that two phase transitions take place, at two different levels: a glass to ferromagnetic transition 
in the space of codewords, and a paramagnetic to glass transition in the space of codes. 
I. INTRODUCTION

Communicating information requires a physical channel whose inherent noise impairs the transmitted signals. 
Reliability can be improved by adding redundancy to the messages, thus allowing the receiver to correct the effects 
of the noise. This procedure has the drawbacks of increasing the cost of generating and sending the messages, and 
of decreasing the speed of transmission. At first sight, better accuracy seems achievable only at the expense of lesser 
efficiency. Remarkably, Shannon showed that, in the limit of infinite-length messages, error-free communication is 
possible using only limited redundancy [1]. His proof of principle has triggered many efforts to construct actual 
error-correcting schemes that would approach the theoretical bounds. A renewal of interest in the subject has taken
place during the last ten years, as new error-correcting codes were finally discovered [2], or rediscovered [3], which
showed practical performance close to Shannon's bounds.

In this paper, we analyze a major family of such codes, the low-density parity-check (LDPC) codes, also known 
as Gallager codes, from the name of their inventor [4]. Our focus is on the characterization of rare decoding errors, 
in situations where most realizations of the noise are accurately corrected. Error-free communication, as guaranteed 
by Shannon's theorem, indeed results from a law of large numbers, and is achieved only with infinite-length messages.
Accordingly, any error-correcting scheme acting on finite-length messages has a non-zero error probability, which 
generically vanishes exponentially with the message length. Such error probabilities are described by error exponents, 
giving their rate of exponential decay. Two kinds of error exponents are usually distinguished: average error exponents, 
where the average is taken over an ensemble of codes, and typical error exponents, where the codes are typical elements
of their ensemble. 

The study of error exponents attracted considerable attention early on in the information theory community,
but exact expressions have turned out to be particularly difficult to derive (see e.g. [5] and [6] for concise and non-technical
reviews with entries in the literature). Exact asymptotic results are known in the limit of the so-called random linear
model [7] (presented in Appendix B), but only loose bounds (presented in Appendix C) have been established for more
general codes. Recently, a systematic finite-length analysis of LDPC codes under iterative decoding was carried out
for the binary erasure channel (BEC) [8, 9], yielding exact, yet non-explicit, formulae for the average error probability.
Up to now, however, little has been known about the error probability under maximum-likelihood decoding, except for
the work of [10] dealing with the binary symmetric channel (BSC).

We address here the problem of computing error exponents of LDPC codes under maximum-likelihood decoding, 
over both the BEC and BSC (all the necessary definitions are recalled below). We adopt a statistical physics point 
of view, which exploits the well-established [11] mapping between error-correcting codes and spin glasses [12]. A
thermodynamic formalism is introduced where error exponents are expressed as large deviation functions [13], which 
we compute by means of the extension of the cavity method [14] proposed in [15]. This approach offers an alternative 
to the related replica method employed in [10] and allows us to address both average and typical error exponents. 
We thus obtain an interesting phase diagram, with two very distinct phase transitions occurring when the intensity 
of the noise in the channels is varied. 
FIG. 1: Error correction scheme. A message $m$ composed of $L$ bits, $m \in \{0,1\}^L$, is first encoded in a codeword of longer size $N$,
with $R = L/N < 1$ defining the rate of the code. The noise $\xi$ of the channel corrupts the transmitted codeword, which becomes
$y$ (see Fig. 2 for examples of channels). This output is generically not a codeword, and the correction consists in inferring the
most probable codeword from which it originates. Finally, the inferred codeword $x'$ is converted back into its corresponding
message $m'$. The communication is successful if $m' = m$.



A brief summary of our results can be found in [16]. We present in what follows a much more detailed account of our 
approach. In a first part, we define LDPC codes, recall their mapping to some models of spin glasses and optimization 
problems, and give a general overview of our thermodynamic (large deviation) formalism. The two subsequent parts 
apply this framework to the analysis of LDPC codes over the binary erasure channel (BEC) and the binary symmetric 
channel (BSC) respectively. We sum up our results in a conclusion where we also point out some open questions. 
Most of the technical calculations are relegated to the appendices, which also contain a detailed discussion of the 
limiting case of random linear codes. 



II. ERROR CORRECTING CODES AND THE LARGE DEVIATION FORMALISM 



A. Error correcting codes 



Error correcting codes are based on the idea that adding sufficient redundancy to the messages can allow the receiver 
to reconstruct them, even if they have been partially corrupted by the noisy channel [17]. A schematic view of how 
these codes operate is presented in figure 1. Given a message composed of $L$ bits, an encoding map $\{0,1\}^L \to \{0,1\}^N$
first introduces redundancy by converting the $L$ bits of the message into a longer sequence of $N$ bits, called a codeword.
The ratio $R = L/N$ defines the rate of the code, and should ideally be as large as possible to reduce communication
costs, yet small enough to allow for corrections. Corrections are implemented downstream of the noisy channel and
specified by a decoding map $\{0,1\}^N \to \{0,1\}^L$ whose purpose is to reconstruct the original message from the received
corrupted codeword. Decoding is composed of two steps: first, the most probable codeword is inferred, and second, 
it is converted into its corresponding message. 

In this scheme, messages and codewords are related by the one-to-one encoding map, and translating messages into
codewords or conversely is relatively straightforward. The computationally most demanding part is concentrated on 
inferring the most probable codeword sent, given the corrupted codeword received. In what follows, we shall focus 
exclusively on this problem, which requires manipulating only codewords. 



B. Communication channels 



Formally, a noisy channel is characterized by a transition probability $Q(y|x)$ giving the probability for its output
to be $y$ given that its input was $x$. For the sake of simplicity, we confine ourselves to memoryless channels where the noise
affects each bit independently of the others, i.e., $Q(y|x) = \prod_{i=1}^{N} Q(y_i|x_i)$ with $Q(y_i|x_i)$ independent of $i$.

We shall consider more specifically two examples of memoryless channels. The first one is the binary erasure channel
(BEC) where a bit is erased with probability $p$, that is $Q(*|x) = p$ and $Q(x|x) = 1 - p$, where $*$ represents an erased
bit (see Fig. 2). The second is the binary symmetric channel (BSC) where a bit is flipped with probability $p$, that is
$Q(0|1) = Q(1|0) = p$ and $Q(0|0) = Q(1|1) = 1 - p$ (see Fig. 2).
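
As a minimal sketch of the two memoryless channels just defined, the following Python functions apply the BEC and the BSC to a list of bits. The function names `bec` and `bsc`, the use of `None` to stand for the erasure symbol, and the example input are illustrative assumptions, not part of the original text.

```python
import random

def bec(bits, p, rng):
    """Binary erasure channel: each bit is independently erased (None) with probability p."""
    return [None if rng.random() < p else b for b in bits]

def bsc(bits, p, rng):
    """Binary symmetric channel: each bit is independently flipped with probability p."""
    return [b ^ 1 if rng.random() < p else b for b in bits]

rng = random.Random(0)            # fixed seed, only for reproducibility
x = [0, 1, 1, 0, 1, 0, 0, 1]      # arbitrary input bits (not a codeword of a specific code)
print("BEC output:", bec(x, 0.3, rng))
print("BSC output:", bsc(x, 0.3, rng))
```

Because both functions treat every position independently and with the same probability, they implement the factorized form of $Q(y|x)$ given above.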



C. LDPC codes and code ensembles 



Shannon first formalized the problem of error correction and determined the highest achievable rate $R$ allowing
error-free correction [1]. He found a general expression for this limit, called the channel capacity, which depends only
on the nature of the channel and takes the form $C_{\rm BEC}(p) = 1 - p$ and $C_{\rm BSC}(p) = 1 + p \log_2 p + (1 - p) \log_2(1 - p)$ for the BEC
FIG. 2: Communication channels. On the left the BEC
(binary erasure channel) erases a bit with probability $p$
and leaves it unchanged with probability $1 - p$. On the
right the BSC (binary symmetric channel) flips a bit with
probability $p$ and leaves it unchanged with probability $1 - p$.




FIG. 3: Factor graph (Tanner graph [18]), drawn for $M = 3$ and $N = 4$. The circles
represent the variable nodes, associated with the $N$ bits
$\{x_i\}$, and the squares represent the $M$ parity checks. In the
example given, the constraints read: (a): x_1 + x_2 + x-^ = 0,
(b): x_2 + x_3 = 0, (c): x_2 + x_3 + x_4 = 0 (modulo 2).



and BSC respectively. Shannon's proof for the existence of codes achieving the channel capacity was non-constructive,
and his analysis was restricted to the limit of infinitely long messages, $L \to \infty$. Amongst the various families of codes
proposed to practically perform error correction, one of the most promising is the family of low-density parity-check 
(LDPC) codes [4]. 
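
For orientation, the two channel capacities can be evaluated numerically. The sketch below assumes the standard forms $C_{\rm BEC}(p) = 1 - p$ and $C_{\rm BSC}(p) = 1 - H_2(p)$, with $H_2$ the binary entropy in bits; the function names are illustrative.

```python
import math

def binary_entropy(p):
    """Binary entropy H_2(p) in bits, with the convention 0*log(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacity_bec(p):
    """Capacity of the binary erasure channel with erasure probability p."""
    return 1.0 - p

def capacity_bsc(p):
    """Capacity of the binary symmetric channel with flip probability p."""
    return 1.0 - binary_entropy(p)

for p in (0.0, 0.1, 0.3, 0.5):
    print(f"p={p:.1f}  C_BEC={capacity_bec(p):.4f}  C_BSC={capacity_bsc(p):.4f}")
```

At equal $p$ the BSC capacity is never larger than the BEC capacity, which reflects the fact that the positions of the corrupted bits are unknown to the receiver on the BSC.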

An LDPC code is defined by a sparse matrix $A$, where "sparse" means that $A$ is mostly composed of 0's, with
otherwise a few 1's. The parity-check matrix $A$ has size $M \times N$ with $M = N - L$, and is associated with a generator
matrix $G$ of size $L \times N$ such that $G A^{T} = 0$ (see e.g. [3] for explicit constructions); the encoding map is taken to be the
linear map $x = G^{T} m$ and the rate of the code is $R = L/N = 1 - M/N$. By construction, an $N$-bit codeword $x$ satisfies
the $M$ parity-check equations $Ax = 0$, or, in other words, the set of codewords is the kernel of $A$. The parity-check
matrix $A$ is usually represented graphically by a factor graph, as in figure 3: the rows of $A$ are associated with
check nodes labeled with $a \in \{1, \ldots, M\}$, and represented by squares, and the columns of $A$ are associated with variable
nodes labeled with $i \in \{1, \ldots, N\}$, and represented by circles. A non-zero element of the matrix $A$ such as $A_{ai} = 1$
appears as a link between the variable node $i$ and the check node $a$.
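
The definitions above can be made concrete on a toy example. The sketch below uses a small hypothetical parity-check matrix (chosen for illustration only; it is not the matrix of figure 3 and is far too small to be "sparse" in any meaningful sense), enumerates its codewords by brute force as the kernel of $A$ modulo 2, and lists the edges of the corresponding factor graph.

```python
from itertools import product

# Hypothetical parity-check matrix with M = 3 checks (rows) and N = 6 bits (columns).
A = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
M, N = len(A), len(A[0])

def syndrome(A, x):
    """Return the vector A x (mod 2); x is a codeword iff all entries vanish."""
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

# Brute-force enumeration of the kernel of A (only feasible for tiny N).
codewords = [x for x in product((0, 1), repeat=N) if not any(syndrome(A, x))]

# Factor graph: check node a is linked to the variable nodes i with A[a][i] = 1.
checks = {a: [i for i in range(N) if A[a][i]] for a in range(M)}

print("number of codewords:", len(codewords))   # 2^(N - rank A)
print("rate R = L/N:", (N - M) / N)              # valid here because A has full rank M
print("check -> variables:", checks)
```

The set `codewords` is closed under bitwise addition modulo 2, which is the group structure exploited throughout the paper.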

A particularly powerful approach for analyzing error-correcting codes is the probabilistic method where, instead 
of considering a single code, one studies an ensemble of codes. With LDPC codes, code ensembles correspond to
sets of matrices, or, equivalently, sets of factor graphs. A popular choice is to consider the ensemble of factor graphs
with given connectivities $c_k$ and $v_\ell$, that is the set of factor graphs having $c_k M$ check nodes with connectivity $k$
and $v_\ell N$ variable nodes with connectivity $\ell$, where $\sum_k c_k = \sum_\ell v_\ell = 1$. A convenient representation is by means
of the generating functions $c(x) = \sum_k c_k x^k$ and $v(x) = \sum_\ell v_\ell x^\ell$; these notations allow for instance to write the mean
connectivities as $\langle k \rangle = c'(1)$ and $\langle \ell \rangle = v'(1)$. Due to their simplicity, particular attention will be devoted to regular
codes, whose check nodes all have the same degree $k$ and variable nodes the same degree $\ell$, corresponding to $c_{k'} = \delta_{k,k'}$ and
$v_{\ell'} = \delta_{\ell,\ell'}$, or, equivalently, $c(x) = x^k$ and $v(x) = x^\ell$.
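
A short sketch of how these generating functions can be handled in practice: degree distributions are stored as dictionaries mapping a degree to its fraction, and the mean connectivities follow from the derivatives at 1. The helper names `poly` and `dpoly` are illustrative, and the rate formula combines $R = 1 - M/N$ with $M/N = \langle \ell \rangle / \langle k \rangle$ (edge counting), both of which are used elsewhere in the text.

```python
def poly(coeffs, x):
    """Evaluate a generating function sum_d coeffs[d] * x**d."""
    return sum(w * x**d for d, w in coeffs.items())

def dpoly(coeffs, x):
    """Evaluate the derivative of the generating function at x."""
    return sum(d * w * x**(d - 1) for d, w in coeffs.items() if d > 0)

# Regular ensemble with k = 6 and l = 3: c(x) = x^6, v(x) = x^3.
c = {6: 1.0}
v = {3: 1.0}

mean_k = dpoly(c, 1.0)          # <k> = c'(1)
mean_l = dpoly(v, 1.0)          # <l> = v'(1)
rate = 1.0 - mean_l / mean_k    # R = 1 - M/N with M/N = <l>/<k>

print("normalization:", poly(c, 1.0), poly(v, 1.0))
print("<k> =", mean_k, " <l> =", mean_l, " R =", rate)
```

The same two helpers work unchanged for irregular ensembles, where several degrees carry non-zero weight.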

The mathematical fact underlying the probabilistic method is the phenomenon of measure concentration which 
occurs in the limit where $N \to \infty$ and $M \to \infty$ with fixed ratio $\alpha = M/N$: in this limit, many properties are shared
by almost all elements of the ensemble (i.e., all but a subset of measure zero). As a consequence, by studying average 
properties over an ensemble, one actually has access to properties of typical elements of this ensemble. This fact is 
one of the building blocks of random graph theory [19] and is also central to the physics of disordered systems where 
it is known as the self-averaging property [20]. 

While the factor graph representation makes obvious the connection between LDPC codes and random graph theory, 
it will also prove particularly fruitful to exploit the close ties of LDPC codes with both optimization problems [21] and
spin-glass systems [20]. LDPC codes are indeed intimately related to a class of combinatorial optimization problems
known as XORSAT problems where, given a sparse matrix $A$ and a vector $\tau$, one is to find solutions $\sigma$ to the equation
$A\sigma = \tau$. Although algorithmically relatively simple (Gaussian elimination provides an answer in a time polynomial in the
size of the matrix), XORSAT problems share many common features with notably more difficult, NP-complete [21],
problems such as $K$-SAT. A recent physical approach to XORSAT problems makes use of their formal equivalence
with a class of spin-glass systems known as p-spin models [22-24]. We shall follow this line of investigation and 
apply the cavity method [14, 25] from spin-glass theory to analyze LDPC codes. We note that alternative, sometimes 
equivalent, physical approaches have previously been applied to LDPC codes; we refer the reader to [26] for a review 
of the subject. 
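
To illustrate why XORSAT is algorithmically simple, here is a minimal sketch of Gaussian elimination over GF(2) for the system $A\sigma = \tau$. The function name `solve_xorsat` and its return convention are assumptions made for this example; free variables are arbitrarily set to 0 when a solution exists.

```python
def solve_xorsat(A, tau):
    """Solve A sigma = tau (mod 2) by Gauss-Jordan elimination.

    Returns (rank, sigma), where sigma is one solution, or None if the system
    is inconsistent. When solvable, there are 2**(N - rank) solutions in total.
    """
    M, N = len(A), len(A[0])
    rows = [list(row) + [t] for row, t in zip(A, tau)]   # augmented matrix
    pivots = []                                          # pivot column of each reduced row
    r = 0
    for col in range(N):
        if r == M:
            break
        pivot = next((i for i in range(r, M) if rows[i][col]), None)
        if pivot is None:
            continue                                     # no pivot: free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(M):
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    # A row reading 0 = 1 signals an unsatisfiable system.
    if any(not any(row[:N]) and row[N] for row in rows):
        return r, None
    sigma = [0] * N
    for i, col in enumerate(pivots):
        sigma[col] = rows[i][N]
    return r, sigma

rank, sigma = solve_xorsat([[1, 1, 0], [0, 1, 1]], [1, 0])
print("rank:", rank, "one solution:", sigma)
```

The cost is polynomial in the size of the matrix, in contrast with the exhaustive search that generic constraint satisfaction problems may require.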

The distinctive feature of XORSAT at the root of its computational simplicity is the presence of an underlying 
group symmetry that relates all solutions. In the context of LDPC codes, it corresponds to the fact that the set of 
codewords is the kernel of the parity-check matrix $A$; we shall refer to the XORSAT problem $A\sigma = 0$ whose solutions
define the set of codewords as the encoding-CSP of the LDPC code with check matrix $A$ (CSP stands for constraint
satisfaction problem). The group symmetry has a number of interesting consequences which will crucially simplify
the analysis.

Most of the interest in LDPC codes stems from the possibility of decoding them using efficient, iterative algorithms
(described in Sec. III A 3). Unless otherwise stated, we shall however be concerned here with the theoretically simpler,
yet computationally much more demanding, maximum-likelihood decoding procedure. It consists in systematically
decoding a received message to the most probable codeword (a task that iterative algorithms are in some cases unable
to perform, as recalled in Sec. III A 3).

Finally, it is interesting to note that in the limit where $\langle k \rangle, \langle \ell \rangle \to \infty$ with fixed ratio, LDPC codes define the
random linear model (RLM) whose typical elements have been shown by Shannon to achieve the channel capacity.
This particular limit, where many quantities can be computed by invoking only elementary combinatorial arguments,
is discussed in detail in appendix B.

D. Typical properties and phase transitions 

The performance of a particular code over a given channel is measured by its error probability, i.e., the probability
that it fails to correctly decode a corrupted codeword. More precisely, if $d(y)$ denotes the inferred codeword when $x$
is sent and $y$ received, one defines the block error probability for $x$ as

$$P_B^{(N)}(x) = \sum_{y} Q(y|x)\, \mathbb{1}_{d(y) \neq x}, \tag{1}$$

and the average block error probability as

$$P_B^{(N)} = \mathbb{E}_x\big[P_B^{(N)}(x)\big], \tag{2}$$

where $\mathbb{E}_x$ denotes the expectation (average) over the set of codewords. With LDPC codes, this average is trivial since,
due to the group symmetry, all codewords are equivalent, and $P_B^{(N)}(x)$ is independent of $x$.

The concentration phenomenon alluded to above means here that $P_B^{(N)} \to P_B$ as $N \to \infty$ within a given code
ensemble defined by generating functions $c(x)$ and $v(x)$. As the level of the noise $p$ is increased, a phase transition
is generically observed: a critical value $p_c$ exists above which error-free correction is no longer possible ($P_B = 0$ for
$p < p_c$ and $P_B = 1$ for $p > p_c$). The formalism to be presented in the next sections will yield in particular the value of
$p_c$ for given code ensembles and channels. Obviously, the presence of this phase transition indicates that, when using
a channel with noise level $p$, one should choose a code from an ensemble for which $p < p_c$. The phase transition
however occurs only in the limit of infinite codewords (thermodynamic limit) whereas practical coding inevitably
deals with finite $N$. As a consequence, the block error probability is not exactly zero, even in the regime $p < p_c$.

For a given code of finite but large block-length $N$, errors can thus be caused by rare, atypical, realizations of the
noise. Similarly, when picking a code at random from a code ensemble of finite size, one can observe properties 
differing from the typical properties predicted by the law of large numbers. We show in what follows how these two 
atypical features induced by finite-size effects can be analyzed in a common framework. 

E. Large deviations 

At this stage, it is useful to make explicit the three different levels of statistics involved in the analysis of error-correcting
codes: (i) Statistics over the codes $C$ in a defined code ensemble $\mathcal{C}$; (ii) Statistics over the set of transmitted
codewords $x$ of a particular code; (iii) Statistics over the noise $\xi$ of the channel, with a specified $p$. For given $C$, $x$,
$\xi$, a fourth level of statistics is involved in the decoding process, over the possible codewords $y \in \{0,1\}^N$ from which
the received corrupted codeword originates. The group structure of the set of codewords of LDPC codes makes the
level (ii) trivial since all codewords are in fact equivalent (isomorphic). We will consequently ignore it and address
only the levels (i) and (iii).

The problem of evaluating the probability that, due to finite-size effects, a property differs from the typical case
belongs to large deviation theory [13]. To give here a general presentation of the concepts and methods to be used, we
assume that the success of the decoding is measured by a function $S_N(\xi, C)$ extensive in $N$, and such that $S_N(\xi, C) \leq 0$
if the code $C$ correctly decodes a message subject to noise $\xi$, and $S_N(\xi, C) > 0$ otherwise; in the next sections, we will
show explicitly how such an observable can be defined with LDPC codes, both for the BEC and the BSC.
In terms of $S_N$, the decoding phase transition takes the following form: in the limit $N \to \infty$, the distribution of the
TABLE I: This table presents the analogy with spin glasses or, more generally, the statistical physics of disordered systems with
quenched disorder.

density $S_N/N$ concentrates around a typical value $s_{\rm typ}(p)$ which satisfies $s_{\rm typ}(p) \leq 0$ if $p < p_c$, and $s_{\rm typ}(p) > 0$ if
$p > p_c$, where $p$ denotes as before the level of noise of the channel (see Fig. 2 for examples).

For typical codes in their ensemble, denoted $C^0$, we describe large deviations of $S_N$ with respect to the noise $\xi$ by
a rate function $L_0(s)$ such that the probability to observe $S_N(\xi, C^0)/N = s$ satisfies

$$\mathbb{P}_N[\xi : S_N(\xi, C^0)/N = s] \asymp e^{-N L_0(s)}. \tag{3}$$



Here the symbol $a_N \asymp b_N$ refers to an exponential equivalence, $\ln a_N / \ln b_N \to 1$ as $N \to \infty$. Viewed as a function
of the noise level $p$, the rate function $E_{\rm typ}(p) = L_0(s = 0)$ is known in the coding literature as the typical error
exponent [5]. The exponential decay with $N$ of atypical properties is quite generic when dealing with large deviations,
but this scaling is not necessarily ensured, as discussed in more detail in appendix A. In the thermodynamic formalism
that we shall adopt, rate functions are computed by introducing a potential $\Phi_C(x)$ defined by

$$\Phi_C(x) = \ln\big(\mathbb{E}_\xi[e^{x S_N(\xi, C)}]\big). \tag{4}$$

In the limit $N \to \infty$, the density $\Phi_C(x)/N$ tends to a typical value $\phi_0(x)$, which is related to the rate function
$L_0(s)$ by

$$e^{N \phi_0(x)} \asymp \int ds\, e^{N[x s - L_0(s)]}. \tag{5}$$

Equivalently, by taking the saddle point,

$$\phi_0(x) = x s - L_0(s), \qquad x = \partial_s L_0(s). \tag{6}$$

The rate function $L_0(s)$ can thus be reconstructed from $\phi_0(x)$ by inverting the Legendre transformation,

$$L_0(s) = s x - \phi_0(x), \qquad s = \partial_x \phi_0(x). \tag{7}$$

The analogy with usual thermodynamics is summarized in table I. 
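
The Legendre structure relating a potential to a rate function can be checked numerically on a toy observable. In the sketch below, $S_N$ is taken to be the number of bits flipped by a BSC of parameter $p$ (a sum of $N$ independent Bernoulli variables), which is an assumption made purely for illustration and is not the decoding observable of the paper. Its potential is $\phi(x) = \ln(1 - p + p e^x)$, and the Legendre transform should reproduce the known rate function $L(s) = s \ln(s/p) + (1 - s) \ln[(1 - s)/(1 - p)]$.

```python
import math

def phi(x, p):
    """Potential (1/N) ln E[exp(x S_N)] for S_N a sum of N Bernoulli(p) variables."""
    return math.log(1.0 - p + p * math.exp(x))

def rate_from_legendre(s, p, x_min=-20.0, x_max=20.0, steps=40001):
    """L(s) = max_x [s x - phi(x)], evaluated by a plain grid search over x."""
    best = -math.inf
    for j in range(steps):
        x = x_min + (x_max - x_min) * j / (steps - 1)
        best = max(best, s * x - phi(x, p))
    return best

def rate_exact(s, p):
    """Closed-form rate function of the Bernoulli sum, valid for 0 < s < 1."""
    return s * math.log(s / p) + (1.0 - s) * math.log((1.0 - s) / (1.0 - p))

p = 0.1
for s in (0.05, 0.1, 0.3):
    print(f"s={s:.2f}  Legendre={rate_from_legendre(s, p):.6f}  exact={rate_exact(s, p):.6f}")
```

The rate function vanishes at the typical value $s = p$ and is positive elsewhere, which is the generic shape assumed for $L_0(s)$ in the text.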

From a theoretical perspective, it is simpler to make an average over the codes and compute the rate function $L_1(s)$
defined as

$$\mathbb{P}_N[\xi, C : S_N(\xi, C)/N = s] \asymp e^{-N L_1(s)}. \tag{8}$$

This procedure yields the so-called average error exponent, $E_{\rm av}(p) = L_1(s = 0)$. In the thermodynamical formalism,
$L_1(s)$ is conjugated to the potential $\phi_1(x)$ satisfying

$$e^{N \phi_1(x)} = \mathbb{E}_{(\xi, C)}\big[e^{x S_N(\xi, C)}\big]. \tag{9}$$

The two rate functions $L_0(s)$ and $L_1(s)$ may differ, meaning that the average exponent can be associated with atypical
codes. Such atypical codes correspond themselves to large deviations of the potential $\Phi_C(x)$. For fixed values of $x$,
we define a rate function $\mathcal{L}(\phi, x)$ as

$$\mathbb{P}_N[C : \Phi_C(x)/N = \phi] \asymp e^{-N \mathcal{L}(\phi, x)}. \tag{10}$$
TABLE II: Analogy with the replica approach of spin glasses. The replica symmetric method prescribes that the typical partition
function $Z_0$ of a disordered system is given by $Z_0 = \mathbb{E}[Z^n]^{1/n}$ with $n \to 0$ or, more precisely, if $\Lambda_N = \ln Z_N$, the typical value
of $\lambda = \Lambda_N/N$ is $\lambda_0 = \lim_{n \to 0} \lim_{N \to \infty} (1/(Nn)) \ln \mathbb{E}[e^{n \Lambda_N}]$. This is mathematically justified by the Gärtner-Ellis theorem which
moreover provides a rigorous basis for the interpretation of non-zero values of $n$ in terms of large deviations, as discussed in the
text. According to this theorem, if the function $\phi(x) = \lim_{N \to \infty} (1/N) \ln \mathbb{E}[e^{x \Lambda_N}]$ exists and is regular enough (see e.g. [13] for
a rigorous presentation), then a large deviation principle holds for $\lambda$ with a rate function being the Legendre transform of $\phi(x)$;
if we assume the functions differentiable, $L(\lambda) = \lambda x - \phi(x)$ with $\lambda = \partial_x \phi(x)$. As a corollary of this theorem, the typical value
$\lambda_0$, which by definition satisfies $L(\lambda_0) = 0$ and $x = \partial_\lambda L(\lambda_0) = 0$, is given by $\lambda_0 = \partial_x \phi(x = 0) = \lim_{x \to 0} [\phi(x)/x]$, as
predicted by the replica method. Note also that $n = 1$, with $Z_1 = \mathbb{E}[Z_N]$, corresponds to the so-called annealed approximation.



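In a thermodynamic formalism, $\mathcal{L}(\phi, x)$ is again associated with a potential $\psi(x, y)$ defined by

$$e^{N \psi(x, y)} = \mathbb{E}_C\big[e^{y \Phi_C(x)}\big] \asymp \int d\phi\, e^{N[y \phi - \mathcal{L}(\phi, x)]}. \tag{11}$$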



We refer to this hierarchical embedding of large deviations as a multi-step large deviation structure [15], a term 
meant to reflect the formal equivalence with the multi-step replica symmetry breaking scenario developed for spin 
glasses [20] (see table II). In the limit $N \to \infty$ where the integral is dominated by its saddle point we obtain the
Legendre transformation

$$\psi(x, y) = y \phi - \mathcal{L}(\phi, x), \qquad y = \partial_\phi \mathcal{L}(\phi, x). \tag{12}$$

Within this extended framework, we recover the average case by taking $y = 1$. Indeed, from the definitions (9) of
$\phi_1(x)$ and (11) of $\psi(x, y)$ it follows that

$$e^{N \psi(x, 1)} = \mathbb{E}_C\big[\mathbb{E}_\xi\, e^{x S_N(\xi, C)}\big] = \mathbb{E}_{(\xi, C)}\big[e^{x S_N(\xi, C)}\big] = e^{N \phi_1(x)}, \tag{13}$$

that is,

$$\psi(x, y = 1) = \phi_1(x). \tag{14}$$

This average case differs in general from the typical case which corresponds to $y = 0$. Indeed, by definition
[see Eq. (10)], typical codes are associated with the potential $\phi_0$ minimizing $\mathcal{L}(\phi, x)$, with $\mathcal{L}(\phi_0, x) = 0$, yielding
$y = \partial_\phi \mathcal{L} = 0$. Note that the potential $\phi_0$ is related to $\psi(x, y)$ by $\phi_0(x) = \lim_{y \to 0} (1/y)\, \psi(x, y)$, which can also be
viewed as a corollary of the Gärtner-Ellis theorem [13], best known in statistical physics as the replica trick [20] (see
table II). In the language of the replica method, the average case ($y = 1$) and the typical case ($y = 0$) are respectively
referred to as the annealed and quenched computations.
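
The difference between the annealed ($y = 1$) and quenched ($y \to 0$) computations can be seen on a deliberately simple toy model, assumed here only for illustration: a "partition function" $Z = \prod_{i=1}^{N} w_i$ made of $N$ independent random factors. In that case $(1/N) \ln \mathbb{E}[Z^y] = \ln \mathbb{E}[w^y]$ exactly, so the limit $y \to 0$ of this potential divided by $y$ can be compared with the typical value $\mathbb{E}[\ln w]$.

```python
import math

def psi(y, weights, probs):
    """Per-factor potential ln E[w**y] for Z = prod_i w_i with i.i.d. factors w_i."""
    return math.log(sum(q * w**y for w, q in zip(weights, probs)))

weights = (1.0, 4.0)    # hypothetical values taken by each factor
probs = (0.5, 0.5)      # with these probabilities

typical = sum(q * math.log(w) for w, q in zip(weights, probs))  # E[ln w], law of large numbers
annealed = psi(1.0, weights, probs)                             # y = 1: ln E[w]
y_small = 1e-6
quenched = psi(y_small, weights, probs) / y_small               # y -> 0: replica-like limit

print("typical  E[ln w]       :", typical)
print("quenched psi(y)/y, y->0:", quenched)
print("annealed psi(1)        :", annealed)
```

The annealed value exceeds the typical one because the average of $Z$ is dominated by rare realizations with atypically large factors, which is the mechanism by which average and typical exponents can differ.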

The previous discussion assumed that the potentials were analytic functions of their parameters $x$ and $y$, but this
may not be the case, and we will find that phase transitions can occur when these temperatures are varied. In such
cases, naively taking the limit $y \to 0$ leads to erroneous results. We will discuss how to overcome such difficulties
when encountering them.



III. LDPC CODES OVER THE BEC 



We now proceed to illustrate our formalism with LDPC codes over the binary erasure channel (BEC). We start by
rederiving the typical phase diagram by means of the cavity method, a slightly different approach from the replica
method originally used in [27]. This sets the stage for the analysis of error exponents that follows.
A. Typical phase diagram

1. Formulation 

Consider an LDPC code $C$ with parity-check matrix $A$; its encoding-CSP (the constraint satisfaction problem whose
SAT-assignments define the codewords) has cost function

$$H_C[\sigma] = \sum_{a=1}^{M} E_a[\sigma], \quad \text{with} \quad E_a[\sigma] = \sum_{i=1}^{N} A_{ai} \sigma_i \pmod 2. \tag{15}$$

Since $E_a[\sigma] \in \{0, 1\}$, the cost function $H_C[\sigma]$ counts the number of constraints violated by the assignment $\sigma =
\{\sigma_i\}_{i=1,\ldots,N}$ (where $\sigma_i \in \{0, 1\}$). When a codeword $\sigma^*$, satisfying $H_C[\sigma^*] = 0$, goes through a BEC, each of its bits
$\sigma_i^*$ has probability $p$ to be erased. A given realization of the noise can be characterized by a vector $\xi = (\xi_1, \ldots, \xi_N)$
with $\xi_i = 1$ implying that the bit $\sigma_i^*$ is lost, and $\xi_i = 0$ that it is unaffected. If we denote by $\mathcal{E}$ the set of indices $i$ for
which $\xi_i = 1$ (erased bits), the decoding task consists in reconstructing $\{\sigma_i^*\}_{i \in \mathcal{E}}$ from the received bits $\{\sigma_i^*\}_{i \notin \mathcal{E}}$ and
the knowledge of the encoding-CSP $H_C$. This decoding problem defines a new constraint satisfaction problem, the
decoding-CSP, obtained from the encoding-CSP by fixing the values of the non-corrupted bits. More explicitly, the
decoding-CSP has cost function $H_C^{(\xi)}[\sigma^{(\xi)}] = \sum_a E_a^{(\xi)}[\sigma^{(\xi)}]$ where $\sigma^{(\xi)} = \{\sigma_i\}_{i \in \mathcal{E}}$ and

$$E_a^{(\xi)}[\sigma^{(\xi)}] = \sum_{i \in \mathcal{E}} A_{ai} \sigma_i + \sum_{i \notin \mathcal{E}} A_{ai} \sigma_i^* \pmod 2. \tag{16}$$

Decoding is possible if and only if $\{\sigma_i^*\}_{i \in \mathcal{E}}$ is the only SAT-assignment of the decoding-CSP.

If $\mathcal{N}_N(\xi, C)$ denotes the number of solutions of the decoding-CSP, $S_N(\xi, C)$ can be taken as $S_N(\xi, C) = \ln \mathcal{N}_N(\xi, C)$.
This entropy fulfills the desired properties, namely $S_N(\xi, C) \leq 0$ if decoding is successful, and $S_N(\xi, C) > 0$ otherwise.
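
For a small code, the entropy of the decoding-CSP can be computed directly by linear algebra. The sketch below relies on a standard fact that is not spelled out in the text: since the transmitted codeword always solves the decoding-CSP, its solutions form an affine space of dimension $|\mathcal{E}| - \mathrm{rank}(A_{\mathcal{E}})$, where $A_{\mathcal{E}}$ is the restriction of $A$ to the erased columns. The matrix `A` is a hypothetical toy example and the function names are illustrative.

```python
import math
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of a list of rows encoded as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot                      # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def decoding_entropy(A, erased):
    """S_N = ln(number of solutions) = (|E| - rank(A restricted to E)) * ln 2."""
    erased = sorted(erased)
    rows = [sum(1 << j for j, i in enumerate(erased) if row[i]) for row in A]
    return (len(erased) - gf2_rank(rows)) * math.log(2)

def failure_probability(A, p):
    """Exact probability that the decoding-CSP has more than one solution on the BEC."""
    N = len(A[0])
    total = 0.0
    for size in range(N + 1):
        for erased in combinations(range(N), size):   # 2**N patterns: tiny codes only
            if decoding_entropy(A, erased) > 0:
                total += p**size * (1 - p)**(N - size)
    return total

A = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
print("entropy, bits {0} erased      :", decoding_entropy(A, [0]))
print("entropy, bits {0,1,2} erased  :", decoding_entropy(A, [0, 1, 2]))
print("failure probability at p = 0.2:", failure_probability(A, 0.2))
```

With this definition the entropy is zero exactly when the erased bits can be recovered uniquely, and it is a positive multiple of $\ln 2$ otherwise.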

The particularity of LDPC codes compared to other error-correcting codes is that the decoding-CSP has the same form
as the encoding-CSP (both are XORSAT problems). As a consequence, the $\mathbb{Z}_2$-symmetry of the group of codewords
is always preserved, at variance with what happens in other CSPs where fixing variables breaks a symmetry. The
BEC is also particular compared with other channels, since the set $\mathcal{E}$ of corrupted bits is known to the receiver (this
will not be the case with the BSC, where identifying the corrupted bits is part of the decoding problem). This entails
that bits can only be fixed to their correct value.

2. Cavity approach 

Before considering large deviations, it is instructive to recall the typical results, i.e. the values taken by $S_N(\xi, C^0)$
when $C^0$ is a typical code from a given ensemble specified by $c(x)$ and $v(x)$, and $\xi$ a typical realization of the noise
from the probability distribution specified by $p$. We resort here to the cavity method at zero temperature [14], whose
validity is based on the tree-like structure of the factor graphs associated with typical LDPC codes. The essentially
equivalent replica method has been used in the past: in [28], $S_N(\xi, C)$ is thus obtained by first computing a free energy
with the replica method, and then taking the zero temperature limit to obtain $S_N(\xi, C)$, viewed as the entropy of the
zero-energy ground states.

The approach we follow here, which corresponds to a particular implementation of the entropic cavity method
presented in [29], has several advantages over the replica approach: it involves neither a zero-replica limit nor a zero-temperature
limit, it emphasizes the specificities of LDPC codes associated with the underlying $\mathbb{Z}_2$ symmetry, and
it naturally connects to the algorithmic analysis of single codes. In the common language of the replica and cavity
methods, the calculation to be done is coined one-step replica symmetry breaking (1RSB), and the entropy $s = S_N/N$
is referred to as a complexity. This is reflected in what follows by the fact that we strictly restrict ourselves to SAT assignments
and assume that all constraints are satisfied (the reweighting parameter $\mu$, as denoted in [25], is here infinite, $\mu = \infty$).
This 1RSB approach is known to exactly describe XORSAT problems [23, 24].

Let $P_i(\sigma_i)$ be the probability, taken over the set of solutions of the decoding-CSP, that the bit $i$ assumes the value
$\sigma_i \in \{0, 1\}$. Due to the preservation of the $\mathbb{Z}_2$-symmetry, no bit can be non-trivially biased: either it is fixed to 0 or 1,
corresponding to $P_i = \delta_0$ and $P_i = \delta_1$ respectively, or it is completely free, corresponding to $P_i = (\delta_0 + \delta_1)/2$, where
we denote $\delta_\tau(\sigma) = \delta_{\tau,\sigma}$. In technical terms, the evanescent fields that are generically required to compute entropies
in CSP [29] have here a trivial distribution, thus explaining that they can be safely ignored, as was done in [28].

Let $\nu$ be the probability, taken over the $N$ nodes of a typical factor graph, that a bit $i$ is free, i.e. that $P_i = (\delta_0 + \delta_1)/2$.
Since a free node has equal probability to be 0 or 1, its contribution to the entropy is $\ln 2$, and the mean entropic
contribution per node is $\nu \ln 2$. This value is however only an upper bound (known as the annealed, or first moment
bound) on the entropy density $s = S_N/N$ that we wish to calculate. In fact, equality holds only if the bits are independent:
indeed, two bits may both be free but, by fixing one, the second may be constrained to a unique value, in which case
the joint entropic contribution of the two nodes is $\ln 2$ and not $2 \ln 2$. The correct expression is given by the Bethe
formula, which can be heuristically derived as follows. First, we sum the entropic contributions $\Delta S_{\circ + \square \in \circ}$ of each
node $\circ$, including the corrections due to its adjacent parity checks $\square \in \circ$. Second, we note that each parity check $\square$
is involved in $k_\square$ terms, with $k_\square$ being the connectivity of $\square$. To count it only once, we therefore subtract $(k_\square - 1)$
times the entropic contribution $\Delta S_\square$ of each parity check $\square$. This leads to

$$s = \frac{1}{N} \Big( \sum_{\circ} \Delta S_{\circ + \square \in \circ} - \sum_{\square} (k_\square - 1)\, \Delta S_\square \Big) = \langle \Delta S_{\circ + \square \in \circ} \rangle - \frac{\langle \ell \rangle}{\langle k \rangle} \sum_k c_k (k - 1) \langle \Delta S_\square^{(k)} \rangle, \tag{17}$$

where $\langle \Delta S_{\circ + \square \in \circ} \rangle$ represents the average of $\Delta S_{\circ + \square \in \circ}$ over the nodes $\circ$, and $\langle \Delta S_\square^{(k)} \rangle$ the average of $\Delta S_\square$ over the
parity checks $\square$ with connectivity $k_\square = k$; the factor $\langle \ell \rangle / \langle k \rangle$ accounts for the ratio of the number $M$ of parity checks
over the number $N$ of nodes.

To compute $\Delta S_{\circ + \square \in \circ}$, we need to know whether the bits of the nodes adjacent to $\circ$ are fixed or not, in the absence
of the "cavity node" $\circ$. As the cavity node is connected to its neighbors through parity checks (see Fig. 4(a)), we
can decompose the computation in two steps. First, we observe that a given neighboring parity check constrains the
value of the cavity node if and only if all the other nodes to which it is connected have themselves their bit fixed in
the absence of the cavity node. Denoting by $\zeta$ the probability of the complementary event, in which the parity check
leaves the cavity node unconstrained, and by $\eta$ the probability for a node to be
free in the absence of one of its adjacent parity checks, we thus have

$$\zeta = \sum_k \frac{k c_k}{\langle k \rangle} \left[ 1 - (1 - \eta)^{k-1} \right] = 1 - \frac{c'(1 - \eta)}{c'(1)}, \tag{18}$$

where $k c_k / \langle k \rangle$ is the probability for a parity check to be connected to $k - 1$ nodes in addition to the cavity node (see
Fig. 4(a)) and $1 - (1 - \eta)^{k-1}$ is the probability that at least one of these $k - 1$ nodes is free in the absence of the
parity check. Next, we observe that the probability for the cavity node to be free is the probability that it is erased and that none of its
adjacent parity checks is constraining, that is

$$\nu = p \sum_\ell v_\ell\, \zeta^\ell = p\, v(\zeta). \tag{19}$$

In order to close the equations, we also need the probability for the cavity node to be free in the absence of one of
its connected parity checks (see Fig. 4(c)), which is

$$\eta = p \sum_\ell \frac{\ell v_\ell}{\langle \ell \rangle}\, \zeta^{\ell - 1} = p\, \frac{v'(\zeta)}{v'(1)}, \tag{20}$$

where $\ell v_\ell / \langle \ell \rangle$ represents the probability for a node to be connected to $\ell - 1$ parity checks in addition to the one
ignored. The "cavity fields" $\eta$ and $\zeta$, determined by (18) and (20), contain all the information needed to evaluate the
entropy. Thus $\langle \Delta S_{\circ + \square \in \circ} \rangle$ is given by

$$\langle \Delta S_{\circ + \square \in \circ} \rangle = (\ln 2) \left[ p\, v(\zeta) - \langle \ell \rangle \zeta \right]. \tag{21}$$

FIG. 5: Reduced entropy vs. noise level $p$ for an LDPC code with $k = 6$ and $\ell = 3$. When $p = 0.4 < p_d$ (left inset), $\eta = 0$ is
the only solution to the cavity equation (24), yielding $s = 0$. When $p = 0.48 > p_d$ (right inset), two more solutions appear, one
of which is stable. The entropy of this solution crosses zero at the critical noise $p_c$, above which the entropy becomes strictly
positive, causing failure of decoding.



The first term, $(\ln 2)\, p\, v(\zeta)$, corresponds to $(\ln 2)\, \nu$, see Eq. (19), the average entropic contribution of a node $\circ$, and
the second, $-(\ln 2) \langle \ell \rangle \zeta$, subtracts the entropic reductions of its adjacent parity-check nodes; indeed there are $\langle \ell \rangle$ of them on
average, and each has at least one other free node, and thus removes an entropy $\ln 2$, with probability $\zeta$. Similarly, the average entropic reduction due to
a parity check alone is



)=-(hi2) [l-(l-r7)* 



(22) 



since 1 — (1 — ry)''' is the probability that at least one of the k connected nodes is free in the absence of the parity 
check (see Fig. 4 (b)). To sum up, the entropy is determined by the formulae 



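$$ s = (\ln 2)\left[p\,v(\zeta) - \frac{\langle\ell\rangle}{\langle k\rangle}\Big(1 - c(1-\eta) - \eta\,c'(1-\eta)\Big)\right], \qquad \zeta = 1-\frac{c'(1-\eta)}{\langle k\rangle}, \tag{23} $$

$$ \eta = \frac{p\,v'(\zeta)}{\langle\ell\rangle}, \tag{24} $$

where $c(x) = \sum_k c_k x^k$ and $v(x) = \sum_\ell v_\ell x^\ell$ are the generating functions of the degree distributions.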



Eq. (24) can admit two kinds of solution (see Fig. 5). The first kind, referred to as ferromagnetic, describes the
situation where decoding is possible, with only one codeword being solution of the decoding-CSP: this solution has
$\eta = 0$ (all bits are fixed to $\sigma^*$) and $s = 0$. The second kind, referred to as paramagnetic (but strictly speaking
corresponding to a 1RSB glassy solution), describes the situation where decoding is impossible, and has $\eta > 0$. It is
found to exist only for $p$ greater than the so-called dynamical threshold, denoted by $p_d$. It is however relevant only
when associated with a positive entropy, $s > 0$, a condition which defines the static threshold, denoted by $p_c$ and
satisfying $p_c > p_d$. The static threshold corresponds to the threshold above which decoding is doomed to fail, as
confirmed by rigorous studies.
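
As a rough numerical illustration of the two thresholds, the sketch below specializes the cavity equations to a regular ensemble, assuming the forms $\zeta = 1-(1-\eta)^{k-1}$, $\eta = p\,\zeta^{\ell-1}$ and $s = (\ln 2)\,[p\,\zeta^{\ell} - (\ell/k)(1-(1-\eta)^k - k\eta(1-\eta)^{k-1})]$. It parametrizes the non-trivial branch by $\eta$, takes $p_d$ as the minimum of $p(\eta) = \eta/\zeta^{\ell-1}$ and $p_c$ as the point where the entropy vanishes. The function names and the grid-plus-bisection strategy are choices made for this example only, and the code assumes $3 \le \ell < k$.

```python
import math

def zeta(eta, k):
    """Probability that a check does not constrain the cavity node (regular case)."""
    return 1.0 - (1.0 - eta) ** (k - 1)

def noise_on_branch(eta, k, l):
    """Noise level p for which eta solves eta = p * zeta**(l - 1)."""
    return eta / zeta(eta, k) ** (l - 1)

def entropy_on_branch(eta, k, l):
    """Entropy s (in nats) of the non-trivial solution labelled by eta."""
    p = noise_on_branch(eta, k, l)
    check_term = 1.0 - (1.0 - eta) ** k - k * eta * (1.0 - eta) ** (k - 1)
    return math.log(2) * (p * zeta(eta, k) ** l - (l / k) * check_term)

def thresholds(k, l, grid=20000):
    """Return (p_d, p_c) for the regular (k, l) ensemble on the BEC."""
    etas = [(i + 1) / grid for i in range(grid)]          # eta in (0, 1]
    eta_d = min(etas, key=lambda e: noise_on_branch(e, k, l))
    p_d = noise_on_branch(eta_d, k, l)
    # Assumption: the entropy is negative at eta_d and positive at eta = 1,
    # so a bisection on eta locates the zero-entropy point.
    lo, hi = eta_d, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if entropy_on_branch(mid, k, l) < 0.0:
            lo = mid
        else:
            hi = mid
    return p_d, noise_on_branch(hi, k, l)

p_d, p_c = thresholds(6, 3)
print(f"p_d = {p_d:.4f}, p_c = {p_c:.4f}")
```

For $(k,\ell) = (6,3)$ the two numbers should be close to the $p_d$ and $p_c$ entries quoted for this ensemble in Table III.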



3. Algorithmic interpretation 

The cavity method is related to a particular decoding algorithm known as belief propagation (BP). Its principle
is the following: starting from a configuration where only the non-corrupted bits are fixed to their values, one goes
through each node of the factor graph, checks if its immediate neighboring environment constrains it to a unique value,
fixes it to this value if it is the case, and iterates the whole procedure until convergence. At the end, some bits may
still not be fixed, which certainly occurs if the decoding-CSP does not have a unique solution, but if all the bits end up fixed,
one is assured to have correctly decoded. Similar message-passing algorithms can be defined with different channels.
They are responsible for the practical interest of LDPC codes as they provide algorithmically efficient decoding (yet
suboptimal, as discussed below). With the BEC, these algorithms are particularly easy to analyze thanks to the fact
that one can never be fooled by fixing bits to an incorrect value. To perform the analysis of the possible outcomes of
the belief propagation algorithm, we can assume without loss of generality that the transmitted message is $(0, \dots, 0)$
(the $Z_2$-symmetry implies that all codewords are equivalent). We thus start with $\sigma_i = *$ if $i \in E$, and $\sigma_i = 0$ otherwise.
Cavity fields are attributed to each oriented link of the factor graph and are updated with the following rules, where
$t$ indexes iteration steps:

$$ h_{i\to a}^{(t+1)} = \begin{cases} 0 & \text{if } \sigma_i = 0 \text{ or if } u_{b\to i}^{(t)} = 1 \text{ for some } b \in i - a,\\ * & \text{otherwise,}\end{cases} \qquad u_{a\to i}^{(t+1)} = \begin{cases} 1 & \text{if } h_{j\to a}^{(t)} = 0 \text{ for all } j \in a - i,\\ * & \text{otherwise.}\end{cases} \tag{25} $$

Here, $u_{a\to i}^{(t)} = 1$ (resp. $*$) means that the parity check $a$ is constraining (resp. is not constraining) $i$. $h_{i\to a}^{(t)} = 0$ (resp.
$*$) means that $\sigma_i$ is fixed (resp. not determined) to its correct value without taking into account the constraints due
to $a$. The algorithm is analyzed statistically by introducing



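$$ \eta^{(t)} = \frac{1}{N\langle\ell\rangle}\sum_{(i,a)}\delta\big(h_{i\to a}^{(t)}, *\big), \qquad \zeta^{(t)} = \frac{1}{N\langle\ell\rangle}\sum_{(i,a)}\delta\big(u_{a\to i}^{(t)}, *\big). \tag{26} $$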



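As suggested by our notations, the evolution of these quantities exactly mimics the derivation of the formulae for
the cavity fields, yielding

$$ \eta^{(t+1)} = \frac{p\,v'(\zeta^{(t)})}{\langle\ell\rangle}, \qquad \zeta^{(t+1)} = 1 - \frac{c'(1-\eta^{(t)})}{\langle k\rangle}. \tag{27} $$
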
The fixed point is given by Eq. (24). When $p < p_d$, the algorithm converges towards the unique, ferromagnetic,
fixed point $\eta^{(\infty)} = \zeta^{(\infty)} = 0$, and decoding is successfully achieved. When $p_d < p < p_c$, a paramagnetic fixed point
appears in addition to the ferromagnetic fixed point and the iteration leads to this second, paramagnetic fixed point.
The belief propagation algorithm thus fails to decode above the dynamical threshold $p_d$, before reaching the static
threshold $p_c$ above which no algorithm can possibly be successful (in this sense, BP is suboptimal).
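
The iterative procedure described in this subsection can be sketched in a few lines. The version below is the check-centred ("peeling") way of writing it on the erasure channel: a parity check with exactly one erased bit fixes that bit, and the sweep is repeated until nothing changes. The toy parity-check structure and all names are hypothetical, chosen only to make the example self-contained.

```python
def peel_decode(checks, received):
    """Iteratively fix erased bits (None) that a parity check determines uniquely.

    checks:   list of lists of bit indices; each check requires an even sum.
    received: list with entries 0, 1 or None (erased).
    Returns the final list; remaining None entries could not be fixed.
    """
    bits = list(received)
    progress = True
    while progress:
        progress = False
        for check in checks:
            unknown = [i for i in check if bits[i] is None]
            if len(unknown) == 1:
                # All other bits of this check are known: parity fixes the last one.
                parity = sum(bits[i] for i in check if bits[i] is not None) % 2
                bits[unknown[0]] = parity
                progress = True
    return bits

# Hypothetical toy code with 7 bits and 3 parity checks.
CHECKS = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
sent = [1, 0, 0, 0, 1, 0, 1]                      # satisfies the three checks
decoded = peel_decode(CHECKS, [None, 0, 0, 0, 1, None, 1])
stuck = peel_decode(CHECKS, [None, 0, 0, 0, None, 0, None])
```

In the first call both erasures are recovered; in the second, every check touching an erased bit contains two erasures, so the procedure stops with bits still undetermined, and no bit is ever set to a wrong value.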



B. Average error exponents 

1. Entropic (1RSB) large deviations

The previous section recalled the properties of typical codes subject to typical noise. With finite codewords, $N < \infty$,
failure to decode may also be due to atypical noise with unusually destructive effects. The purpose of our large
deviation approach is to investigate such events. We first focus on the simplest case, namely the computation of the
average error exponent where both the codes $C$ and the noise $\xi$ are treated on the same footing (see Sec. II E). Our
procedure to deal with the statistics over atypical factor graphs is an application of the cavity method for large
deviations proposed in [15]. For the sake of simplicity, we restrict ourselves here to regular codes, where nodes and check nodes
both have fixed connectivity, $\ell$ and $k$ respectively, and defer the generalization to irregular codes to Appendix D.

As explained in Sec. II E, the thermodynamic formalism assigns a Boltzmann weight $e^{x S_N(C,\xi)}$ to each "configuration"
$(C, \xi)$. The parameter $x$ plays the role of an inverse temperature or, in other words, is a Lagrange multiplier enforcing
the value of $S_N$. Taking the infinite temperature limit $x = 0$ (no constraint on the value of $S_N$) will thus lead us back
to the typical case discussed above.

The cavity equations are, as before, derived by considering the effect of the addition of a node. As adding a new
node, along with its adjacent parity checks, inevitably increases the degrees of the other nodes, strictly restricting
to regular graphs is not possible and we must work in a larger framework. Accordingly, we consider ensembles where
the degree of parity checks is fixed to $k$, but where the degree of nodes has a distribution $\{v_L\}$ (meaning that degree
$L$ has probability $v_L$, independently for each node). We will describe the regular ensemble by taking $v_L = \delta_{\ell,L}$ in
the final formulae. Adding a new node with $\ell$ parity checks brings us from an ensemble characterized by $\{v_L\}$ to an
ensemble characterized by $\{v'_L\}$, with

$$ v'_L = \Big(1 - \frac{\ell(k-1)}{N}\Big)v_L + \frac{\ell(k-1)}{N}\,v_{L-1} = v_L + \frac{\ell(k-1)}{N}\,\delta v_L, \tag{28} $$

where $\delta v_L = v_{L-1} - v_L$, since $\ell(k-1)$ nodes have their degree increased by one. Let us denote by $L(s,\{v_L\})$ the rate
function for the probability to observe $S_N/N = s$ in an ensemble characterized by $\{v_L\}$, that is

$$ P_N\big[(C,\xi) : S_N(C,\xi)/N = s \,\big|\, \{v_L\}\big] \asymp e^{-N L(s,\{v_L\})}. \tag{29} $$

We introduce $P^{(\ell)}_{\circ+\square\in\circ}(\Delta S)$, the probability distribution of the entropy contribution caused by the addition of the new
node along with its $\ell$ adjacent parity checks. The passage from $N$ nodes to $N+1$ nodes can then be described by

$$ \begin{aligned} P_{N+1}\big(s = S/(N+1)\,\big|\,\{v_L\}\big) &\asymp e^{-(N+1)\,L(S/(N+1),\{v_L\})} \\ &= \sum_\ell v_\ell\int d\Delta S\; P^{(\ell)}_{\circ+\square\in\circ}(\Delta S)\; P_N\big[s = (S-\Delta S)/N \,\big|\, \{v_L - \ell(k-1)\,\delta v_L/N\}\big] \\ &\asymp \sum_\ell v_\ell\int d\Delta S\; P^{(\ell)}_{\circ+\square\in\circ}(\Delta S)\; e^{-N L[(S-\Delta S)/N,\,\{v_L - \ell(k-1)\,\delta v_L/N\}]}. \end{aligned} \tag{30} $$

Expanding for large $N$, one gets

$$ \phi_s(x) = x s - L(s,\{v_L\}) = \ln\sum_\ell v_\ell\int d\Delta S\; P^{(\ell)}_{\circ+\square\in\circ}(\Delta S)\; e^{x\Delta S + z\,\ell(k-1)} \tag{31} $$

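with

$$ x = \frac{\partial L}{\partial s}, \qquad z = \sum_L \delta v_L\,\frac{\partial L}{\partial v_L}. \tag{32} $$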

The parameter $z$ is determined by noting that the addition of a new parity check changes the node degree distribution
in the same way as in Eq. (28), with $v'_L = v_L + (k/N)\,\delta v_L$, yielding

$$ e^{-N L(S/N,\{v_L\})} = \int d\Delta S\; P_\square(\Delta S)\; e^{-N L[(S-\Delta S)/N,\,\{v_L - (k/N)\,\delta v_L\}]}, \tag{33} $$

where $P_\square(\Delta S)$ is the probability of the entropy reduction caused by the addition of a new parity check. Expanding
here also for large $N$ leads to an equation for $z$,

$$ z = -\frac{1}{k}\ln\int d\Delta S\; P_\square(\Delta S)\; e^{x\Delta S}. \tag{34} $$

Following the same line of reasoning as in the typical case, the two distributions $P^{(\ell)}_{\circ+\square\in\circ}$ and $P_\square$ can be expressed
by means of cavity fields $\eta$ and $\zeta$. First consider the addition of a node: if the bit of the new node is fixed, either
because it was not erased or because one of its adjacent parity checks constrains it, there is an entropic reduction $-\ln 2$
per non-constraining adjacent parity check, and thus a weight $2^{-x}$. Otherwise, if the new node is free, which occurs
with probability $p\,\zeta^{\ell}$, the entropy shift is $(\ln 2)(1-\ell)$, giving a weight $2^{x(1-\ell)}$. Taking $v_L = \delta_{\ell,L}$, Eq. (31) therefore
reads

$$ \phi_s(x) = \ln\Big[\big(\zeta 2^{-x} + 1 - \zeta\big)^{\ell} - p\,\big(\zeta 2^{-x}\big)^{\ell} + p\,\zeta^{\ell}\,2^{x(1-\ell)}\Big] + \ell(k-1)\,z, \tag{35} $$

with

$$ \zeta = 1 - (1-\eta)^{k-1}. \tag{36} $$

Similarly, a new parity check removes a degree of freedom if and only if one of its adjacent nodes is free, which happens
with probability $1-(1-\eta)^{k}$, yielding

$$ z = -\frac{1}{k}\ln\Big[1 - \big(1-(1-\eta)^{k}\big) + \big(1-(1-\eta)^{k}\big)\,2^{-x}\Big]. \tag{37} $$

Finally, we obtain a self-consistent equation for $\eta$ by considering the addition of a new (cavity) node in the absence
of one of its adjacent parity checks:

$$ \eta = P(\text{cavity node free}) \propto \int d\Delta S\; P_{\circ+\square}(\Delta S\,|\,\text{cavity node free})\; e^{x\Delta S + z(\ell-1)(k-1)}, \tag{38} $$

$$ 1-\eta = P(\text{cavity node fixed}) \propto \int d\Delta S\; P_{\circ+\square}(\Delta S\,|\,\text{cavity node fixed})\; e^{x\Delta S + z(\ell-1)(k-1)}, \tag{39} $$

FIG. 6: Rate function $L(s)$ as a function of the entropy $s$, here illustrated with a regular code with $k = 6$ and $\ell = 3$ (for the
BEC). The three regimes are represented. (a) $p = 0.2 < p_{1RSB}$: the spinodal of the paramagnetic solution is for $s_d > 0$.
(b) $p = 0.35 \in [p_{1RSB}, p_d]$: the spinodal is now for $s_d < 0$. (c) $p = 0.45 \in [p_d, p_c]$: the spinodal is preceded by a minimum (the
typical value), with $x_d = \partial_s L(s = s_d) < 0$. The typical dynamical and static transitions can be read on the $s = 0$ axis: by
definition of $p_d$ and $p_c$, this equation has a solution $s$ for $p > p_d$, and this solution is positive, $s > 0$, for $p > p_c$ (not represented
here).



$(k,\ell)$    $p_{1RSB}$       $p_{RS}$        $p_e$           $p_d$           $p_c$
(4,3)     0.3252629709    0.5465748811   0.6068720166   0.6474256494   0.7460097025
(6,3)     0.2668568754    0.3378374641   0.3491884902   0.4294398144   0.4881508842
(6,5)     0.01300820524   0.4277010368   0.7143657513   0.5510035344   0.8333153204
(10,5)    0.04412884546   0.2435656894   0.3347721176   0.3415500230   0.4994907179

TABLE III: Values of some thresholds $p_{1RSB}$, $p_{RS}$, $p_e$, $p_d$ and $p_c$ for different regular ensembles of LDPC codes on the BEC.



where $P_{\circ+\square}$ corresponds to $P^{(\ell)}_{\circ+\square\in\circ}$, taken either under the condition that the cavity node is free, or that it is fixed.
We obtain:

$$ \eta = \frac{p\,2^{x}\,\big(\zeta 2^{-x}\big)^{\ell-1}}{\big(\zeta 2^{-x} + 1 - \zeta\big)^{\ell-1} + p\,(2^{x}-1)\,\big(\zeta 2^{-x}\big)^{\ell-1}}. \tag{40} $$

Alternatively, these equations can be obtained by differentiation of Eq. (35), which is variational with respect to the
cavity field $\eta$. The large deviation cavity equations (36) and (40) allow us to compute the generating function $\phi_s(x)$ using
Eqs. (35) and (37), from which the rate function $L(s\,|\,\{v_L = \delta_{\ell,L}\})$ is deduced by Legendre transformation as discussed
in Sec. II E.

Again, two kinds of solutions, paramagnetic or ferromagnetic, can be present. For a given value of $p$, we find that a
non-trivial, paramagnetic solution to Eq. (40) exists only for $x > x_d(p)$. In agreement with the observation reported
in the previous section that the paramagnetic solution typically exists only when $p > p_d$, we have $x_d(p) < 0$ for $p > p_d$
and $x_d(p) > 0$ for $p < p_d$ (the typical case is indeed associated with $x = 0$). We obtain the average error exponent by
selecting the value of $L(s)$ where $s = 0$: our results are illustrated in Fig. 6. By extension of the concept of dynamical
threshold $p_d$, one could define a "dynamical" error exponent as $E_d(p) = L(x_d(p)) = x_d(p)\,s(x_d(p)) - \phi(x_d(p))$, with
$x_d(p)$ corresponding to the temperature of the spinodal for the paramagnetic solution. The relevance of this concept
is however limited by the fact that the algorithmic interpretation presented in Sec. III A 3 does not extend to large
deviations (see also Sec. III C 3).

More interestingly, we find an additional threshold, denoted $p_{1RSB}$, below which the equation $s(x) = 0$ no longer has
a solution (see Fig. 6). This inconsistency of the 1RSB solution is indicative of the presence of a phase transition
occurring at some $p_e > p_{1RSB}$. The following section is devoted to computing $p_e$, and describing the nature of the
new phase present for $p < p_e$.
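
To make the entropic large-deviation computation concrete, here is a minimal sketch for a regular $(k,\ell)$ ensemble. It assumes the following readings of the equations of this subsection: $\zeta = 1-(1-\eta)^{k-1}$, the cavity equation $\eta = p\,2^{x}(\zeta 2^{-x})^{\ell-1} / [(\zeta 2^{-x}+1-\zeta)^{\ell-1} + p\,(2^{x}-1)(\zeta 2^{-x})^{\ell-1}]$, $z = -\frac{1}{k}\ln[(1-\eta)^k + (1-(1-\eta)^k)2^{-x}]$ and $\phi_s(x) = \ln[(\zeta 2^{-x}+1-\zeta)^{\ell} - p(\zeta 2^{-x})^{\ell} + p\,\zeta^{\ell}2^{x(1-\ell)}] + \ell(k-1)z$. The fixed-point iteration started from $\eta = 1$, the finite-difference Legendre transform and all function names are choices made here for illustration.

```python
import math

def solve_eta(p, x, k, l, iters=2000):
    """Iterate the large-deviation cavity equation, starting from eta = 1."""
    eta = 1.0
    w = 2.0 ** (-x)
    for _ in range(iters):
        zeta = 1.0 - (1.0 - eta) ** (k - 1)
        free = p * 2.0 ** x * (zeta * w) ** (l - 1)          # node free
        fixed = (zeta * w + 1.0 - zeta) ** (l - 1) - p * (zeta * w) ** (l - 1)
        eta = free / (free + fixed)
    return eta

def phi_s(p, x, k, l):
    """Generating function phi_s(x) evaluated at the cavity fixed point."""
    eta = solve_eta(p, x, k, l)
    zeta = 1.0 - (1.0 - eta) ** (k - 1)
    w = 2.0 ** (-x)
    a = 1.0 - (1.0 - eta) ** k            # P(at least one of k nodes is free)
    z = -math.log(1.0 - a + a * w) / k
    node = ((zeta * w + 1.0 - zeta) ** l - p * (zeta * w) ** l
            + p * zeta ** l * 2.0 ** (x * (1 - l)))
    return math.log(node) + l * (k - 1) * z

def rate_point(p, x, k, l, h=1e-5):
    """Return (s, L) with s = d(phi_s)/dx (central difference) and L = x*s - phi_s."""
    s = (phi_s(p, x + h, k, l) - phi_s(p, x - h, k, l)) / (2.0 * h)
    return s, x * s - phi_s(p, x, k, l)

# Typical case x = 0 for (k, l) = (6, 3) at p = 0.47 (between p_d and p_c).
s0, L0 = rate_point(0.47, 0.0, 6, 3)
```

At $x = 0$ the rate function vanishes and $s$ reduces to the typical entropy, which is negative for $p < p_c$. Scanning $x$ and recording the pairs $(s, L)$ traces a curve of the kind shown in Fig. 6, as long as the iteration stays on the paramagnetic branch.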



2. Energetic (RS) large deviations 



The previous "entropic (1RSB) approach" attributed errors to the presence of an exponential number of solutions in
the decoding-CSP. The same assumption was underlying the analysis of the typical case, in Sec. III A 2, where rigorous
studies support the conclusions drawn from this hypothesis. This view is also consistent with the phase diagram of
XORSAT problems to which the encoding-CSP belongs. The structure of the well separated codewords corresponds in
this context to a "frozen 1RSB glassy" phase. As $p$ departs from the value $p = 1$ however, the decoding-CSP deviates
increasingly in nature from the initial encoding-CSP. As the number of constraints increases (as $p$ decreases), the
presence of an exponential number of solutions (glassy phase) in addition to the isolated correct codeword becomes
less and less probable. An alternative rare event possibly dominating the probability of error at low $p$ is the presence
of a second isolated (ferromagnetic) codeword close to the correct one. This can lead to a new phase transition that
has no counterpart in the typical phase diagram, reflected by a non-analyticity of the error exponent.

FIG. 7: Average error exponent as a function of the noise level $p$ for the regular code ensemble with $k = 6$ and $\ell = 3$, on the
BEC. Numerical estimates of the error probability, based on 10^ runs of exact Maximum Likelihood decoding (using Gauss
elimination) on samples of sizes ranging from $N = 500$ to $N = 1500$, yield reasonably good estimates of the error exponent
using an exponential fit. These numerical results agree well with our theoretical prediction. The union bound (C11) and the
random linear limit (62) are also represented for comparison.
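
The caption of Fig. 7 refers to exact maximum-likelihood decoding by Gauss elimination. On the erasure channel this amounts to a linear-algebra test, sketched below under the standard observation that the erased bits are uniquely determined if and only if the columns of the parity-check matrix restricted to the erased positions are linearly independent over GF(2). The bitmask encoding, the function names and the toy parity checks are illustrative assumptions, not the setup used for the figure.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors encoded as Python ints (bit a = row a)."""
    pivots = {}                       # leading bit position -> reduced vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v       # new independent direction
                break
            v ^= pivots[top]          # eliminate the leading bit
    return len(pivots)

def ml_decodable(checks, erased):
    """True if the erased bits are uniquely determined by the parity checks."""
    columns = []
    for i in erased:
        col = 0
        for a, check in enumerate(checks):
            if i in check:
                col |= 1 << a         # bit i participates in check a
        columns.append(col)
    return gf2_rank(columns) == len(erased)

# Hypothetical toy code with 7 bits and 3 parity checks.
CHECKS_ML = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
ok = ml_decodable(CHECKS_ML, [0, 5])        # independent columns
fail = ml_decodable(CHECKS_ML, [0, 4, 6])   # columns sum to zero over GF(2)
```

Repeating this test over many random erasure patterns and codes at fixed $p$ gives an empirical block-error probability, whose decay with $N$ can then be fitted by an exponential.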

In our framework, investigating an alternative source of error requires considering for $S_N$ a quantity other than
the entropy of the number of solutions. A possible choice, associated with a replica symmetric (RS) Ansatz, is the
energy $E_N$ of the ground state of the decoding-CSP, giving the minimal number of violated parity checks. Ignoring the
correct codeword, a second isolated codeword is present if and only if $E_N = 0$ (otherwise $E_N > 0$). Large deviations
of this energy are described by the rate function $L_1(e)$ defined as



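$$ P\big[\xi, C : E_N(\xi, C)/N = e\big] \asymp e^{-N L_1(e)}. \tag{41} $$

The generating function for the rate function $L_1(e)$, defined by

(42)

is given by (see [24] for a similar calculation)

$$ \phi_1(x) = \ln\Bigg[p\int\prod_{a=1}^{\ell}du_a\,Q(u_a)\,e^{-x\left(\sum_{a=1}^{\ell}|u_a| - \left|\sum_{a=1}^{\ell}u_a\right|\right)} + (1-p)\int\prod_{a=1}^{\ell}du_a\,Q(u_a)\,e^{-x\sum_{a=1}^{\ell}(|u_a| - u_a)}\Bigg] - \frac{\ell(k-1)}{k}\ln\int\prod_{i=1}^{k}dh_i\,P(h_i)\,e^{-2x\,\delta\left(\prod_{i=1}^{k}S(h_i),\,-1\right)} \tag{43} $$

with

$$ P(h \neq +\infty) \propto p\int\prod_{a=1}^{\ell-1}du_a\,Q(u_a)\,e^{-x\left(\sum_{a=1}^{\ell-1}|u_a| - \left|\sum_{a=1}^{\ell-1}u_a\right|\right)}\,\delta\Big(h - \sum_{a=1}^{\ell-1}u_a\Big), \tag{44} $$

$$ P(h = +\infty) \propto (1-p)\int\prod_{a=1}^{\ell-1}du_a\,Q(u_a)\,e^{-x\sum_{a=1}^{\ell-1}(|u_a| - u_a)}, \tag{45} $$

$$ Q(u) = \int\prod_{i=1}^{k-1}dh_i\,P(h_i)\,\delta\Big(u - \prod_{i=1}^{k-1}S(h_i)\Big), \tag{46} $$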



where $S(x) = 1$ if $x > 0$, $-1$ if $x < 0$, and $0$ if $x = 0$. Since $u$ only takes values in $\{-1, 0, +1\}$, and $h$ is restricted to
integer values, we can introduce

$$ Q(u) = q_+\,\delta(u-1) + q_-\,\delta(u+1) + q_0\,\delta(u), \tag{47} $$

and

$$ p_+ = \int_{h>0}dh\,P(h), \qquad p_- = \int_{h<0}dh\,P(h). \tag{48} $$



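Our interest is here in zero-energy ground states, described by the limit $x \to \infty$ where the equations simplify to:

$$ \phi_1(x\to+\infty) = -L_1(e = 0) = \ln\Big[(1-q_-)^{\ell} + p\,(1-q_+)^{\ell} - p\,q_0^{\ell}\Big] - \frac{\ell(k-1)}{k}\ln\Big[1 - \frac12\Big((p_+ + p_-)^{k} - (p_+ - p_-)^{k}\Big)\Big] \tag{49} $$

with

$$ p_+ \propto (1-q_-)^{\ell-1} - p\,q_0^{\ell-1}, \tag{50} $$
$$ p_- \propto p\,(1-q_+)^{\ell-1} - p\,q_0^{\ell-1}, \tag{51} $$
$$ p_0 \propto p\,q_0^{\ell-1}, \tag{52} $$
$$ q_+ = \frac12\Big[(p_+ + p_-)^{k-1} + (p_+ - p_-)^{k-1}\Big], \tag{53} $$
$$ q_- = \frac12\Big[(p_+ + p_-)^{k-1} - (p_+ - p_-)^{k-1}\Big], \tag{54} $$
$$ q_0 = 1 - (p_+ + p_-)^{k-1}. \tag{55} $$

We find that the only stable solution to these cavity equations satisfies $q_0 = p_0 = 0$, which allows us to further simplify
the formulae:

$$ \phi_e(+\infty) = \ln\Big[q_+^{\ell} + p\,(1-q_+)^{\ell}\Big] - \frac{\ell(k-1)}{k}\ln\Big[\frac12\Big(1 + (2p_+ - 1)^{k}\Big)\Big] \tag{56} $$

with

$$ p_+ = \frac{q_+^{\ell-1}}{q_+^{\ell-1} + p\,(1-q_+)^{\ell-1}}, \tag{57} $$
$$ q_+ = \frac12\Big[1 + (2p_+ - 1)^{k-1}\Big]. \tag{58} $$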


The resulting RS average error exponent, given by $E_e(p) = -\phi_e(+\infty)$, is represented in Fig. 7.
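
As an illustration, the RS exponent can be evaluated by iterating the simplified cavity equations. The sketch below assumes the forms $p_+ = q_+^{\ell-1}/(q_+^{\ell-1} + p\,(1-q_+)^{\ell-1})$, $q_+ = \frac12[1 + (2p_+-1)^{k-1}]$ and $\phi_e(+\infty) = \ln[q_+^{\ell} + p\,(1-q_+)^{\ell}] - \frac{\ell(k-1)}{k}\ln[\frac12(1 + (2p_+-1)^{k})]$. The starting point $q_+ = 1/2$ and the fixed number of iterations are arbitrary choices; the iteration is not guaranteed to converge to a non-trivial solution for all $p$.

```python
import math

def rs_energetic_exponent(p, k, l, iters=1000):
    """Return E_e(p) = -phi_e(+infinity) for the regular (k, l) ensemble on the BEC."""
    q_plus = 0.5
    p_plus = 0.5
    for _ in range(iters):
        a = q_plus ** (l - 1)
        b = p * (1.0 - q_plus) ** (l - 1)
        p_plus = a / (a + b)
        q_plus = 0.5 * (1.0 + (2.0 * p_plus - 1.0) ** (k - 1))
    node = q_plus ** l + p * (1.0 - q_plus) ** l
    check = 0.5 * (1.0 + (2.0 * p_plus - 1.0) ** k)
    phi = math.log(node) - (l * (k - 1) / k) * math.log(check)
    return -phi

# Large connectivities at rate R = 1 - l/k = 1/2 approach the random-code value.
E_large = rs_energetic_exponent(0.2, 200, 100)
```

For large $k$ and $\ell$ at fixed $\ell/k = 1-R$, the result approaches $(1-R)\ln 2 - \ln(1+p)$, the random-code expression of the energetic exponent.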

We identify the transition $p_e$ as the point where the 1RSB and RS error exponents coincide, which satisfies $p_e > p_{1RSB}$.
We find that the RS solution is limited by a spinodal point and is only defined for $p > p_{RS}$. While we
conjecture that the 1RSB estimate is exact for $p > p_e$, the existence of $p_{RS}$ suggests that either an additional phase
transition occurs at some $p'_e > p_{RS}$, or, more radically, that our description of the phase $p < p_e$ is incorrect. The
limit case of random codes however indicates that the energetic method is valid in the limit $k, \ell \to \infty$.

3. Limit of random codes

The only limiting case where the average error exponent has been obtained in full so far is the fully connected
limit where $k, \ell \to \infty$ with $\ell/k = \alpha = 1 - R$ fixed. This limit corresponds to the random linear model (RLM), where
each parity check is connected to each node with probability 1/2. In this limit, the entropic 1RSB approach gives

$$ E_s(k,\ell\to\infty) = L(s = 0) = D(1-R\,\|\,p), \tag{59} $$

where $D(q\,\|\,p) = q\ln(q/p) + (1-q)\ln((1-q)/(1-p))$ is known as the Kullback-Leibler divergence, while the energetic
RS approach gives

$$ E_e(k,\ell\to\infty) = -\phi_e(+\infty) = -(R-1)\ln 2 - \ln(1+p), \tag{60} $$

(with $p_+ = 1/(1+p)$ and $q_+ = 1/2$). The two expressions coincide at the critical noise $p_e$, with

$$ p_e = \frac{1-R}{1+R}. \tag{61} $$

We thus predict the average error exponent of the RLM to be

$$ E_{\rm av} = \begin{cases} (1-R)\ln 2 - \ln(1+p) & \text{if } p < \dfrac{1-R}{1+R},\\[2mm] D(1-R\,\|\,p) & \text{if } \dfrac{1-R}{1+R} < p < 1-R. \end{cases} \tag{62} $$

This result coincides with the exact expression (see Appendix B for a direct combinatorial derivation), thus validating
our approach in this particular case.
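
The piecewise prediction for the random linear model is easy to tabulate. The sketch below encodes the two branches, $(1-R)\ln 2 - \ln(1+p)$ for $p < p_e = (1-R)/(1+R)$ and $D(1-R\,\|\,p)$ for $p_e < p < 1-R$. Returning zero for $p \ge 1-R$, where decoding typically fails, is a convention added here, and the function names are illustrative.

```python
import math

def kl_divergence(q, p):
    """Kullback-Leibler divergence D(q || p) between two Bernoulli laws, in nats."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

def rlm_average_exponent(p, rate):
    """Average error exponent of the random linear model on the BEC."""
    p_e = (1.0 - rate) / (1.0 + rate)
    if p < p_e:
        return (1.0 - rate) * math.log(2) - math.log(1.0 + p)   # energetic (RS) branch
    if p < 1.0 - rate:
        return kl_divergence(1.0 - rate, p)                     # entropic (1RSB) branch
    return 0.0                                                  # above capacity (convention)

for p in (0.1, 0.2, 1.0 / 3.0, 0.4, 0.5):
    print(f"p = {p:.3f}  E = {rlm_average_exponent(p, 0.5):.5f}")
```

For $R = 1/2$ the two branches meet continuously at $p_e = 1/3$, where both equal $\frac12\ln(9/8)$.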

As explained above, we are not able to fully account for the small noise regime as soon as $k$ and $\ell$ are finite,
even though the solutions are found to be stable with respect to further replica symmetry breakings in the space
of codewords [30]. This does not exclude that a similar replica symmetry breaking occurs in the space of codes.
Remarkably, previous attempts reported in the literature have also failed to obtain error exponents in the low $p$
regime.



C. Typical error exponents 

1. Cavity equations 



The typical error exponent is encoded into a potential $\psi(x,y)$, as defined in Eq. (13). The equations for $\psi(x,y)$
are obtained from the cavity method for large deviations by following very closely the path leading to $\phi(x)$ [31]. As
noticed in Sec. II, the formalism with finite $y$ provides a generalization of the average case, which is recovered by
taking $y = 1$, with $\psi(x, y=1) = \phi(x)$. We will therefore only quote our results. In the entropic (1RSB) case, we find

$$ \psi_s(x,y) = \ln\Big[\big(\zeta 2^{-xy} + 1 - \zeta\big)^{\ell} - \big(\zeta 2^{-xy}\big)^{\ell} + \zeta^{\ell}\,\big(p\,2^{x} + 1 - p\big)^{y}\,2^{-xy\ell}\Big] - \frac{\ell(k-1)}{k}\ln\Big[(1-\eta)^{k} + \big(1-(1-\eta)^{k}\big)\,2^{-xy}\Big] \tag{63} $$

with


((2-^^ + 1 - c)'^"^ - {C2-'-'yy-^ + C^-i(p2^ + 1 -p)y2-(^-i)^!'' 
$$ \zeta = 1 - (1-\eta)^{k-1}. \tag{64} $$

In the energetic (RS) case with $x = +\infty$, we find

$$ \psi_e(x = +\infty, y) = \ln\Big[q_+^{\ell} + p^{y}\,(1-q_+)^{\ell}\Big] - \frac{\ell(k-1)}{k}\ln\Big[\frac12\Big(1 + (2p_+ - 1)^{k}\Big)\Big] \tag{65} $$

with

$$ p_+ = \frac{q_+^{\ell-1}}{q_+^{\ell-1} + p^{y}\,(1-q_+)^{\ell-1}}, \tag{66} $$
$$ q_+ = \frac12\Big[1 + (2p_+ - 1)^{k-1}\Big]. \tag{67} $$

FIG. 8: Rate function $\mathcal{L}(L_e) = \mathcal{L}[-\phi_e(+\infty)]$ of the energetic error exponent for an LDPC code with $k = 24$, $\ell = 12$ on the
BEC. When $p > p_y$ (full curve), the rate function is negative (and therefore unphysical) for all $0 < y < 1$, entailing that the
typical and average error exponents should coincide. When $p < p_y$ (dashed curve), we postulate that the typical error exponent
is given by the inverse "freezing temperature" $y_c$ at which the rate function cancels.



In each case, from the potential $\psi(x,y)$, the rate function is obtained as $\mathcal{L}(\phi, x) = y\,\phi - \psi(x,y)$, with $\phi(x) = \partial_y\psi(x,y)$.
By definition, a typical code corresponds to a minimum of $\mathcal{L}$, with $\mathcal{L} = 0$, which, when $\mathcal{L}$ is analytical at this minimum,
is associated with $y = \partial_\phi\mathcal{L} = 0$.

As a generic feature, we find that $\mathcal{L}(y,x)$ is an increasing function of $y$ for fixed $x$, going from negative values
for $y < y_c(x)$ to positive ones for $y > y_c(x)$. Negative rate functions, as thus obtained, are certainly unphysical.
As with negative entropies in the usual cavity/replica method, we attribute them to analytical continuations of physical
solutions. The simplest way to circumvent them is, as with the frozen 1RSB Ansatz in the replica method, to select
$y_c(x)$ with $\mathcal{L}(y_c, x) = 0$. When $y_c(x) < 1$, meaning that $\mathcal{L}(y = 1, x) > 0$, we consider that the average exponent is
associated with atypical codes and therefore differs from the typical exponent, described by $\mathcal{L}(y_c(x), x) = 0$. Using
this criterion, we find that the two exponents indeed differ for the lowest values of $p$, when $p < p_y$, where $p_y < p_e$
(see Fig. 8 for an illustration). In general the situation is complicated by the fact that the cavity equations may fail
to provide solutions in this regime, as already seen in the average case when $p < p_{RS}$ (corresponding here to $y = 1$);
the random code limit, where this complication is absent, is thus the most instructive.



2. Limit of random codes

In the limit $k, \ell \to \infty$, we obtain the following results. In the entropic regime, $p > p_e$, the average and typical
exponents are found to coincide. This conclusion extends to the energetic regime only for a restricted interval $[p_y, p_e]$.
When $p < p_y$, we have $y_c(x) < 1$ and average and typical error exponents differ. The formula we obtain for the typical
error exponent reads

(68)

with

$$ p_y = \frac{\delta_{GV}(R)}{1 - \delta_{GV}(R)}. \tag{69} $$

$\delta_{GV}(R)$ denotes the smallest solution to $(R-1)\ln 2 + H(\delta) = 0$, whose interpretation is discussed in Appendix B.
This result, which does not seem to have been reported previously in the literature, coincides with the union bound
presented in Appendix C, which strongly suggests that it is indeed exact.
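
The threshold $p_y$ only requires the smallest root $\delta_{GV}(R)$ of $(R-1)\ln 2 + H(\delta) = 0$. A minimal sketch is given below, assuming that $H$ is the binary entropy in nats, $H(\delta) = -\delta\ln\delta - (1-\delta)\ln(1-\delta)$, and using a bisection on $[0, 1/2]$, where $H$ is increasing. The function names are illustrative.

```python
import math

def binary_entropy_nats(d):
    """Binary entropy H(d) in nats, with H(0) = H(1) = 0."""
    if d <= 0.0 or d >= 1.0:
        return 0.0
    return -d * math.log(d) - (1.0 - d) * math.log(1.0 - d)

def delta_gv(rate):
    """Smallest solution of (rate - 1) ln 2 + H(delta) = 0, by bisection on [0, 1/2]."""
    target = (1.0 - rate) * math.log(2)
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if binary_entropy_nats(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def p_y(rate):
    """Noise level below which typical and average exponents differ (random-code limit)."""
    d = delta_gv(rate)
    return d / (1.0 - d)

print(f"delta_GV(1/2) = {delta_gv(0.5):.5f}, p_y(1/2) = {p_y(0.5):.5f}")
```

For $R = 1/2$ one finds $\delta_{GV} \approx 0.110$, hence a value of $p_y$ well below $p_e = (1-R)/(1+R) = 1/3$, consistent with the ordering $p_y < p_e$.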






For LDPC codes with finite connectivity, a similar phase diagram is expected. In the entropic regime, we find indeed that
average and typical exponents are identical. In the energetic regime, we face the problem that the cavity equations
have no solution below some value of $p$, which precludes us from estimating $p_y$.



3. Algorithmic implications 

The cavity formalism has the attractive property of corresponding formally to message passing algorithms. Based
on this analogy, new algorithmic procedures have been systematically proposed to analyze single finite graphs, each
time the cavity approach was found to operate at the ensemble level. With a phase transition occurring at the
ensemble level, we have however here a system where such a correspondence is no longer meaningful. Following the
usual procedure, it is indeed straightforward to implement the cavity approach for the average error exponent on a single
graph, but in the regime $p < p_y$, this algorithm is doomed to fail: for any typical graph, in the limit of large size, the
message passing algorithm will yield the average error exponent, which, as we have seen, is distinct from the correct,
typical, error exponent.



IV. LDPC CODES OVER THE BSC 



A. Definition 



We now turn to error exponents for LDPC codes on the binary symmetric channel (BSC). One motivation for
repeating the analysis with this channel is that it is representative of a broader class of channels, where bits are not
simply erased as with the BEC, but can be corrupted, in the sense that their content, 0 or 1, is changed to another
admissible value. This clearly complicates the decoding as corrupted bits cannot be straightforwardly identified; in
fact, with the BSC, no scheme can guarantee to identify the corrupted bits, and the receiver is never certain that
his decoding is correct. We will however see that the overall phase diagram is very similar to that obtained with the
BEC.

By definition, maximum-likelihood decoding consists in inferring the most probable realization of the noise a
posteriori. The a posteriori probability can be expressed from the a priori probability thanks to Bayes' theorem. If
$x$ denotes the transmitted message and $y$ the received message, the a priori probability to receive $y$ given $x$ is

$$ Q(y|x) = \prod_{i=1}^{N}(1-p)^{\delta_{x_i,y_i}}\,p^{1-\delta_{x_i,y_i}}. \tag{70} $$

To make contact with physical models of disordered systems [12], it is convenient to adopt a spin convention, $\sigma_i = (-1)^{y_i}$,
$\tau_i = (-1)^{x_i}$, and to rewrite the previous relation as

$$ Q(\sigma|\tau) \propto e^{\sum_i h_i\tau_i}, \qquad h_i = h_0\,\sigma_i, \qquad h_0 = \frac12\ln\Big(\frac{1-p}{p}\Big). \tag{71} $$
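
The equivalence between the channel law and its spin form can be checked numerically. The sketch below compares $\prod_i (1-p)^{\delta_{x_i,y_i}} p^{1-\delta_{x_i,y_i}}$ with $e^{\sum_i h_i\tau_i}$, using $h_i = h_0\sigma_i$ and $h_0 = \frac12\ln((1-p)/p)$; the two differ only by the constant factor $(p(1-p))^{N/2}$, which does not depend on the messages. The function names and the example words are illustrative.

```python
import math

def channel_likelihood(x, y, p):
    """Probability of receiving y when x is sent through a BSC of flip probability p."""
    flips = sum(a != b for a, b in zip(x, y))
    return (1.0 - p) ** (len(x) - flips) * p ** flips

def spin_weight(x, y, p):
    """Unnormalized weight exp(sum_i h_i tau_i) with h_i = h0 * sigma_i."""
    h0 = 0.5 * math.log((1.0 - p) / p)
    tau = [(-1) ** b for b in x]      # transmitted bits as spins
    sigma = [(-1) ** b for b in y]    # received bits as spins
    return math.exp(sum(h0 * s * t for s, t in zip(sigma, tau)))

p = 0.1
y = [0, 1, 1, 0, 1]
for x in ([0, 1, 1, 0, 1], [1, 1, 1, 0, 0], [1, 0, 0, 1, 0]):
    ratio = channel_likelihood(x, y, p) / spin_weight(x, y, p)
    print(x, ratio)                   # same ratio for every x
```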



This formulation emphasizes the analogy with the random field Ising model [32], a prototypical disordered system.
Using the group symmetry of the set of codewords, we can assume, without loss of generality, that the sent codeword
is $\tau = (+1, \dots, +1)$. With this simplification, the random field takes value $h_i = h_0$ with probability $1-p$ and $-h_0$
with probability $p$. Bayes' formula for the a posteriori probability that the message $\tau$ was sent reads

$$ P(\tau|\sigma) = \frac{1}{Z(\beta)}\,e^{\beta\sum_i h_i\tau_i}\prod_a\delta(\tau_a, 1), \tag{72} $$

where $\tau_a$ is a shorthand for $\prod_{j\in a}\tau_j$: in the present spin convention, the constraint induced by the parity check $a$
indeed reads $\tau_a = 1$. To continue the analogy with statistical mechanics, we have also introduced a temperature $\beta$,
called the decoding temperature, whose value is here fixed to $\beta = 1$ (Nishimori temperature, see [11]). Given the
a posteriori probability, the selection of the most probable codeword $d(\sigma)$ can still be done according to different
criteria, amongst which:

• Word maximum a posteriori (word-MAP), where one maximizes the posterior probability in block by taking
$d_{\rm block}(\sigma) = {\rm argmax}_\tau\,P(\tau|\sigma)$. This scheme minimizes the block-error probability $P_{\rm block}$, i.e., the probability that $d_{\rm block}(\sigma)$ differs from the sent codeword.
• Symbol maximum a posteriori (symbol-MAP), where one maximizes the posterior probability bit per bit
by taking $d_{\rm bit}(\sigma)_i = {\rm argmax}_{\tau_i}\,P(\tau_i|\sigma)$. This scheme minimizes the bit-error probability $P_{\rm bit}$, i.e., the average fraction of bits $i$ for which $d_{\rm bit}(\sigma)_i$ differs from the sent bit.

In physical terms, the word-MAP procedure consists in finding the ground state of the system with partition function
$Z(\beta)$ given by the normalization in Eq. (72); this amounts to studying the zero-temperature limit $\beta \to \infty$. Conversely,
symbol-MAP is equivalent to taking the sign of the local magnetizations at temperature $\beta = 1$,

$$ d_{\rm bit}(\sigma)_i = {\rm sign}\big(\langle\tau_i\rangle\big) = {\rm sign}\Big(\sum_\tau \tau_i\,P(\tau|\sigma)\Big). \tag{73} $$

We will treat the two cases in a common framework by considering an arbitrary temperature $\beta \geq 1$.
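
For very small codes the two decoding criteria can be compared by brute-force enumeration. The sketch below lists all codewords of a toy parity-check code, computes the posterior at $\beta = 1$ from the BSC likelihood with a uniform prior over codewords, and returns the word-MAP and symbol-MAP estimates. It works with bits rather than spins, and the helper names and the length-3 repetition code are illustrative assumptions; the enumeration is exponential in $N$ and is not how decoding is done in practice.

```python
from itertools import product

def codewords(checks, n):
    """All length-n binary words satisfying every parity check (even sum)."""
    return [w for w in product((0, 1), repeat=n)
            if all(sum(w[i] for i in c) % 2 == 0 for c in checks)]

def posterior(received, words, p):
    """Posterior probability of each codeword given the received word (uniform prior)."""
    weights = []
    for w in words:
        flips = sum(a != b for a, b in zip(w, received))
        weights.append(p ** flips * (1.0 - p) ** (len(received) - flips))
    total = sum(weights)
    return [x / total for x in weights]

def word_map(received, words, p):
    """Codeword with the largest posterior probability."""
    post = posterior(received, words, p)
    return max(zip(post, words))[1]

def symbol_map(received, words, p):
    """Bitwise estimate from the posterior marginals."""
    post = posterior(received, words, p)
    n = len(received)
    marg = [sum(pr for pr, w in zip(post, words) if w[i] == 1) for i in range(n)]
    return tuple(int(m > 0.5) for m in marg)

# Hypothetical example: repetition code of length 3, one flipped bit.
REP_CHECKS = [[0, 1], [1, 2]]
words = codewords(REP_CHECKS, 3)
received = (1, 1, 0)
block_estimate = word_map(received, words, 0.1)
bit_estimate = symbol_map(received, words, 0.1)
```

In this example both criteria return the all-ones word; in general the symbol-MAP output need not even be a codeword.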

From the physical perspective, the original codeword is recovered if it dominates the Gibbs measure defined in
Eq. (72). This can be expressed by decomposing the partition function $Z(\beta)$ as

$$ Z(\beta) = Z_{\rm corr}(\beta) + Z_{\rm err}(\beta), \qquad Z_{\rm corr}(\beta) = e^{\beta\sum_i h_i}, \qquad Z_{\rm err}(\beta) = \sum_{\tau\neq 1} e^{\beta\sum_i h_i\tau_i}\prod_a\delta(\tau_a, 1). \tag{74} $$

We define the corresponding free energies, $F_{\rm corr}(\beta) = -(1/\beta)\ln Z_{\rm corr}(\beta)$ and $F_{\rm err}(\beta) = -(1/\beta)\ln Z_{\rm err}(\beta)$. The first
one corresponds physically to a ferromagnetic phase (as with the BEC), while the second will be shown to correspond
either to a paramagnetic or a glassy phase, depending on the values of $\beta$ and $p$. Decoding is successful if, and only
if, the ferromagnetic phase has lower free energy, $F_{\rm corr} < F_{\rm err}$. The quantity $S_N$ introduced in Sec. II E can
therefore be defined here as

$$ S_N = F_{\rm corr}(\beta) - F_{\rm err}(\beta), \tag{75} $$

where the dependence on the noise $\xi$ and the code $C$ is implicitly understood.



B. Cavity analysis and the IRSB frozen ansatz 



As with the BEC, explicit calculations can be performed by means of the replica or cavity methods. Details can
be found in Appendix E and we only discuss here the points where differences with the BEC arise. For any fixed $p$,
a replica symmetric (RS) calculation, whose derivation follows the derivation of the paramagnetic solution with the
BEC, is found to undergo an entropy crisis, i.e., $s_{RS}(\beta) = \beta^2\partial_\beta f_{RS}(\beta) < 0$ for $\beta > \beta_g$. This feature is indicative of
the presence of a glassy phase, and points to the need to break the replica symmetry. The glassy phase of LDPC
codes is however of the "frozen 1RSB" type, which implies that the glassy free energy $f_{\rm err}$ can be completely inferred
from the replica symmetric solution $f_{RS}$. This simplicity stems from the "hard" nature of the constraints: changing a
bit automatically violates all its surrounding checks, forcing the rearrangement of many variables [33, 34]. When the
degree of all nodes is $\ell_i \geq 2$, one can indeed show [24] that changing one bit while keeping all checks satisfied requires
the rearrangement of an extensive ($\propto N$) number of variables (in the language of [24], factor graphs of LDPC codes
have no leaves). The consequence, expressed in the replica language, is that the 1RSB "states" are reduced to single
configurations, and thus have zero internal entropy. The 1RSB potential $\phi(\beta, m)$, whose optimization over $m \in [0, 1]$
is predicted to yield $f_{\rm err}$ [20], thus simplifies to $\phi(\beta, m) = f_{RS}(\beta m)$ [35], since

$$ \sum_{{\rm states}\ \alpha} e^{-N\beta m\,e_\alpha} = e^{-N\beta m\,f_{RS}(\beta m)}, \tag{76} $$

where $e_\alpha$ denotes the energy density of the single configuration making up the state $\alpha$. According to whether one is above or below the freezing temperature defined by

$$ s_{RS}(\beta_g) = \beta_g^2\,\partial_\beta f_{RS}(\beta_g) = 0, \tag{77} $$

the free energy $f_{\rm err}(\beta)$ is given either by $f_{RS}(\beta)$ (paramagnetic phase), or by $f_{RS}(\beta_g)$ (glassy phase). This is summarized
as follows:

$$ f_{\rm err}(\beta) = \max_{\beta'\in[0,\beta]} f_{RS}(\beta') = \begin{cases} f_{RS}(\beta) & \text{if } \beta < \beta_g,\\ f_{RS}(\beta_g) & \text{if } \beta > \beta_g. \end{cases} \tag{78} $$
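
The freezing prescription can be illustrated on any replica-symmetric free energy whose entropy changes sign. The sketch below does not use the LDPC free energy of Appendix E: it takes as a stand-in the random-energy-model form $f_{RS}(\beta) = -\ln 2/\beta - \beta/4$, for which $s_{RS}(\beta) = \beta^2\partial_\beta f_{RS} = \ln 2 - \beta^2/4$ vanishes at $\beta_g = 2\sqrt{\ln 2}$. It then locates $\beta_g$ numerically and freezes the free energy beyond it. All names and the bracketing interval are assumptions of this example.

```python
import math

def f_rs(beta):
    """Stand-in RS free energy (random-energy-model form), NOT the LDPC one."""
    return -math.log(2) / beta - beta / 4.0

def entropy_rs(beta, h=1e-6):
    """s_RS(beta) = beta^2 * d f_RS / d beta, by central finite difference."""
    return beta ** 2 * (f_rs(beta + h) - f_rs(beta - h)) / (2.0 * h)

def freezing_beta(lo=0.1, hi=10.0):
    """Bisection for the inverse temperature beta_g at which s_RS vanishes."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if entropy_rs(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_err(beta):
    """Frozen ansatz: f_RS(beta) below beta_g, f_RS(beta_g) above."""
    return f_rs(min(beta, freezing_beta()))

print(freezing_beta(), f_err(1.0), f_err(3.0))
```

Since $f_{RS}$ increases with $\beta$ as long as the entropy is positive and decreases afterwards, freezing at $\beta_g$ is the same as taking the maximum of $f_{RS}(\beta')$ over $\beta' \le \beta$.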



Finally, we note that, as in the BEC case, a non-ferromagnetic solution $f_{RS}(\beta)$ exists only for large enough $p$.
The threshold $p_d(\beta)$ giving the smallest noise level at which a non-ferromagnetic solution exists is again called the
dynamical threshold, and can be shown here also to coincide with the dynamical arrest of BP [28].

C. Average error exponent

1. LDPC codes

In the region relevant for error exponents, where $p < p_c$ and $\beta \geq 1$, the ferromagnetic solution is typically dominant
(this is the definition of $p < p_c$), and metastable phases described by $f_{\rm err}$ are typically glassy, since $\beta_g < 1$. Therefore,
to compute error exponents, we have to consider $f_{\rm err}(\beta) = f_{RS}(\beta_g)$, and not $f_{\rm err}(\beta) = f_{RS}(\beta)$. This leads us to
introduce an extra temperature $\beta_e$, distinct from the decoding temperature $\beta$, which is to be set to $\beta_g$ by requiring
that the entropy $s_{RS}$ is zero. Similarly, we introduce a ferromagnetic temperature set to $\beta_f = \beta$, and define the
rate function $L_1(f_e, f_f)$ and its Legendre transform as



The potential $\phi_1$ contains all the necessary information about both solutions:

$$ -\beta_a f_a = \partial_{x_a}\phi_1, \qquad s_a = -\frac{\beta_a}{x_a}\,\partial_{\beta_a}\phi_1, \tag{80} $$

where the index $a = e, f$ corresponds to the two possible phases. For the purpose of computing error exponents, we
need only to control $f_e - f_f$ and $s_e$, for all temperatures $\beta_e < \beta$. Note that the ferromagnetic solution $f_f$ has no
entropy, $s_f = 0$, which is here reflected by the fact that the potential $\phi_1$ depends upon $\beta_f$ and $x_f$ only through
$m_f = \beta_f x_f$. These observations allow us to focus on a simplified potential

$$ \phi(\beta_e, m) = \phi_1\Big(\beta_e,\ x_e = \frac{m}{\beta_e},\ m_f = -m\Big), \tag{81} $$

which satisfies:

$$ \partial_m\phi = f_f - f_e, \qquad \partial_{\beta_e}\phi = -\frac{m}{\beta_e^2}\,s_e. \tag{82} $$

As with the BEC, the average error exponent is identified with the smallest value of Li such that Se > and 
ff — /e > 0. The present formulation is in fact equivalent to the presentation based on the replica method given 
in [10]. A remarkable consequence of the analysis is that the average error exponent is predicted to be the same for 
any /3 > 1. Indeed, both the glassy and the ferromagnetic free energies are temperature- independent for (3 > (3g. In 
particular, symbol and word-MAP are predicted to have same error exponents. 

Based on the cavity equations given in Appendix E, the potential $\phi$ can be computed numerically by population
dynamics. As an illustration, we plot in Fig. 9 the rate function $L_1(f_f - f_e, s_e = 0)$ for a regular code with $k = 6$,
$\ell = 3$. As in the case of the BEC, three regimes can be distinguished, according to the value of $p$:

• $p < p_{1RSB}$: no zero-entropy RS solution typically exists, and $f_e < f_f$ for the metastable solutions.

• $p_{1RSB} < p < p'_d$: no zero-entropy RS solution typically exists, but the dominant metastable solutions have $f_e > f_f$.

• $p'_d < p < p_c$: a zero-entropy RS solution is typically present.

The major difference with the BEC is that the threshold $p'_d$, defined by $p'_d = p_d(\beta_g(p'_d))$, does not coincide with the
dynamical threshold $p_d(\beta)$: indeed, here $p'_d$ is defined in relation to the existence of a solution with positive entropy,
while, in the framework of BP, the dynamical arrest $p_d$ is related to the existence of a paramagnetic solution at
decoding temperature $\beta^{-1}$ [28]. In Fig. 10, we plot the average error exponent for regular codes with $k = 6$, $\ell = 3$.
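
As a rough illustration of what a population-dynamics computation looks like, the following Python sketch iterates the plain replica-symmetric recursion of Appendix E (fields $h = h_i + \sum u$, biases $u = \beta^{-1}\,\mathrm{atanh}\prod\tanh\beta h$) for a regular ensemble on the BSC. It is not the large-deviation scheme used for Fig. 9, which additionally reweights the populations. The function names, the population size, the number of sweeps and the channel field $h_0 = \frac{1}{2}\ln[(1-p)/p]$ are assumptions made for this example only.

```python
import math
import random

def rs_population_dynamics(p, k=6, ell=3, beta=1.0, pop_size=1000, sweeps=50, seed=0):
    """Population of check-to-variable biases u at the RS (BP) fixed point on the BSC."""
    rng = random.Random(seed)
    h0 = 0.5 * math.log((1.0 - p) / p)   # assumed channel field (not taken from the paper)
    cap = 1.0 - 1e-15                     # keeps atanh finite when the fields saturate
    pop = [0.0] * pop_size                # paramagnetic start: all biases are zero

    def channel_field():
        # all-zero codeword sent: each bit is flipped with probability p
        return -h0 if rng.random() < p else h0

    for _ in range(sweeps * pop_size):
        prod = 1.0
        for _ in range(k - 1):            # k-1 incoming variable-to-check fields
            h = channel_field() + sum(rng.choice(pop) for _ in range(ell - 1))
            prod *= math.tanh(beta * h)
        prod = max(-cap, min(cap, prod))
        # overwrite a random member of the population with the new bias
        pop[rng.randrange(pop_size)] = math.atanh(prod) / beta
    return pop

def magnetization(pop, p, ell=3, beta=1.0, samples=20000, seed=1):
    """Average of tanh(beta * H), where H is the full local field on a variable."""
    rng = random.Random(seed)
    h0 = 0.5 * math.log((1.0 - p) / p)
    total = 0.0
    for _ in range(samples):
        field = (-h0 if rng.random() < p else h0) + sum(rng.choice(pop) for _ in range(ell))
        total += math.tanh(beta * field)
    return total / samples
```

With $k = 6$, $\ell = 3$ and $\beta = 1$, one would expect the magnetization returned by this sketch to approach 1 at low noise (for instance $p = 0.03$) and to stay well below 1 at $p = 0.2$, which lies above the capacity of a rate-1/2 code.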

D. The random code limit 

1. Average error exponent 

As with the BEC, the $k, \ell \to \infty$ limit can be computed exactly, yielding

$$E_1 = L_1(f_f = f_e, s_e = 0) = D(\delta_{GV}(R)\|p), \tag{83}$$

where $\delta_{GV}(R)$ denotes the smallest solution to $R - 1 + H(\delta) = 0$. In this regime, errors are most likely to be caused
by large noises driving the received message beyond the typical nearest-codeword distance.

FIG. 9: Large deviation rate $L_1(f_f - f_e, s_e = 0)$ as a function of the difference between the ferromagnetic and the
non-ferromagnetic free energies, here for regular codes with $k = 6$ and $\ell = 3$ on the BSC. The thresholds are $p_{1RSB} \simeq 0.058$ and
$p_c \simeq 0.100$. The three regimes are represented. From left to right: $p = 0.045$, $p = 0.07$ and $p = 0.09$.

FIG. 10: Average error exponent as a function of the noise level $p$ for the regular code ensemble with $k = 6$ and $\ell = 3$ through
the BSC. Here $p_{1RSB} \simeq 0.058$. The union bound (C17) and the random linear model ($k, \ell \to \infty$) limit (B14) are also represented
for comparison.

As pointed out in [10], a second ferromagnetic solution is present in this limit (see Appendix E for details), yielding
the error exponent:

$$E_1 = -\ln\left[\frac{1}{2}\left(1 + 2\sqrt{p(1-p)}\right)\right] - R\ln 2. \tag{84}$$

Such a solution also exists for finite $k, \ell$, but is clearly unphysical (it predicts negative exponents for $k = 6$, $\ell = 3$).
Yet it correctly describes the low-$p$ phase (B14) in the $k, \ell \to \infty$ limit, where failure is caused by the existence of
one (or a few) unusually close codewords. In that sense it plays the same role as the energetic solution in the BEC
analysis, with the difference that it cannot be extended to any case with finite connectivities. The critical noise $p_e$ below
which such a scenario occurs is given by:

$$\frac{\sqrt{p_e}}{\sqrt{p_e} + \sqrt{1 - p_e}} = \delta_{GV}(R). \tag{85}$$

We thus predict the average error exponent to be:

$$E_1(\mathrm{RLM}) = \begin{cases} D(\delta_{GV}(R)\|p) & \text{if } p_e < p < p_c, \\ -\ln\left[\frac{1}{2}\left(1 + 2\sqrt{p(1-p)}\right)\right] - R\ln 2 & \text{if } p < p_e. \end{cases} \tag{86}$$

This expression coincides with the exact result (B14) of the RLM.
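
The two branches of the average exponent can be evaluated numerically. The sketch below, in Python, solves for $\delta_{GV}(R)$ and for the critical noise of Eq. (85) by bisection, and then evaluates the piecewise expression for the RLM on the BSC. All function names are illustrative, entropies and divergences are taken in nats, and $p_c$ is identified with $\delta_{GV}(R)$, which holds for the RLM.

```python
import math

def h_nats(x):
    """Binary entropy in nats."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def kl(a, b):
    """Binary Kullback-Leibler divergence D(a||b) in nats, for 0 < a, b < 1."""
    return a * math.log(a / b) + (1.0 - a) * math.log((1.0 - a) / (1.0 - b))

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    sign_lo = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_gv(rate):
    # smallest root of (R - 1) ln 2 + H(delta) = 0, located in (0, 1/2)
    return bisect(lambda d: (rate - 1.0) * math.log(2.0) + h_nats(d), 1e-12, 0.5)

def critical_noise(rate):
    # noise p_e at which sqrt(p) / (sqrt(p) + sqrt(1 - p)) equals delta_GV(R)
    d = delta_gv(rate)
    return bisect(lambda p: math.sqrt(p) / (math.sqrt(p) + math.sqrt(1.0 - p)) - d, 1e-12, 0.5)

def average_exponent_bsc(rate, p):
    # piecewise average exponent of the RLM, for 0 < p <= delta_GV(R)
    if p < critical_noise(rate):
        return (1.0 - rate) * math.log(2.0) - math.log(1.0 + 2.0 * math.sqrt(p * (1.0 - p)))
    return kl(delta_gv(rate), p)
```

For $R = 1/2$ this gives $\delta_{GV} \simeq 0.110$ and $p_e \simeq 0.015$; the two branches meet continuously at $p_e$, and the exponent vanishes at $p = \delta_{GV}(R)$.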
FIG. 11: Rate function $\mathcal{L}(L)$ for the RLM on the BSC with $R = 1/2$ and $p = 0.005 > p_y$ (full curve), $p = 0.001 < p_y$ (dashed
curve).



2. Typical error exponent 



The typical exponent of the RLM can be evaluated using the two-step potential: 



E, 



c c 



d0 e 



N{y4,-C{4>,P,,m)) 



(87) 



The details of the calculations by the cavity method are given in Appendix E. As in the average case, two distinct
solutions appear. The first one is the counterpart of the solution discussed in section IV C. It yields, in the random
linear limit:

$$\psi(\beta_e, m, y) = y\,\phi(\beta_e, m). \tag{88}$$

A consequence of the linear dependence on $y$ is that $\phi$ always takes the value obtained from the average calculation,
irrespective of $y$. Therefore, the average and typical error exponents coincide in this regime, and are given by (83).

This solution is however only valid in the high-noise regime ($p > p_e$). As in the average case, for low $p$, the errors
in decoding are dominated by the presence of a sub-exponential (zero-entropy) number of close codewords. The
associated solution has for potential

$$\psi(y) = -yL - \mathcal{L} = (R-1)\ln 2 + \ln\left[1 + \left(2\sqrt{p(1-p)}\right)^y\right]. \tag{89}$$

We observe two types of behavior according to the value of $p$: for $p_y < p < p_e$, $\mathcal{L}(y)$ is negative for $0 < y < 1$, whereas
for $p < p_y$, it crosses $0$ at $y_c < 1$ (see Fig. 11). Interpreting, as in the BEC analysis (see section III C 1), negative
values of $\mathcal{L}$ as the evidence of a glassy transition in the space of codes, we deduce that the typical error exponent is
given by $L(y_c)$ when $y_c < 1$, in which case it differs from the average error exponent. To sum up:

$$E_0(\mathrm{RLM}) = \begin{cases} L(y_c) = -\delta_{GV}(R)\,\ln\left[2\sqrt{p(1-p)}\right] & \text{if } p < p_y, \\ L(y = 1) = E_1(\mathrm{RLM}) & \text{if } p_y < p < p_c, \end{cases} \tag{90}$$

where the critical noise $p_y(R)$ is solution of:

$$\frac{2\sqrt{p_y(1-p_y)}}{1 + 2\sqrt{p_y(1-p_y)}} = \delta_{GV}(R). \tag{91}$$

This exponent coincides with the RLM limit of the union bound (C18), and is rigorously established [7] to be the 
correct typical error exponent on the BSC. 
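
The low-noise branch of the typical exponent can be checked numerically from the potential $\psi(y) = (R-1)\ln 2 + \ln[1 + (2\sqrt{p(1-p)})^y]$. The Python sketch below assumes the Legendre convention $\psi(y) = -yL - \mathcal{L}$ with $L = -\partial_y\psi$, locates the zero $y_c$ of $\mathcal{L}$ by bisection, and returns $L(y_c)$. Function names are illustrative and the routine presumes $p < p_y$, so that a zero exists in $(0, 1)$.

```python
import math

def h_nats(x):
    """Binary entropy in nats."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    sign_lo = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_gv(rate):
    # smallest root of (R - 1) ln 2 + H(delta) = 0
    return bisect(lambda d: (rate - 1.0) * math.log(2.0) + h_nats(d), 1e-12, 0.5)

def rate_function_and_exponent(rate, p, y):
    """Return (calL, L) at conjugate parameter y, from the potential psi(y)."""
    z = 2.0 * math.sqrt(p * (1.0 - p))
    zy = z ** y
    psi = (rate - 1.0) * math.log(2.0) + math.log(1.0 + zy)
    exponent = -zy * math.log(z) / (1.0 + zy)   # L = -d(psi)/dy
    cal_l = -psi - y * exponent                 # from psi = -y L - calL
    return cal_l, exponent

def typical_exponent_low_noise(rate, p):
    """(y_c, L(y_c)), with y_c the zero of calL in (0, 1); assumes p < p_y."""
    y_c = bisect(lambda y: rate_function_and_exponent(rate, p, y)[0], 1e-9, 1.0)
    return y_c, rate_function_and_exponent(rate, p, y_c)[1]
```

For $R = 1/2$ and $p = 0.001$ the value returned agrees with $-\delta_{GV}(R)\ln[2\sqrt{p(1-p)}]$, while for $p = 0.005$ the function $\mathcal{L}$ is still negative at $y = 1$, consistent with the two curves described in the caption of Fig. 11.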



V. CONCLUSION 



Since Shannon laid the basis for information theory, the analysis of error-correcting codes has been a major subject
of study in this field of science [4]. Error-correcting codes aim at reconstructing signals altered by noise. Their
performance is measured by their error probability, i.e. the probability that they fail in accomplishing this task. For
block codes, where the messages are taken from a set of $2^M$ codewords of length $N$, it is known that when the rate
$R = M/N$ is below the channel capacity $R_c$, the probability of error behaves, in the limit of large $N$, at best, as
$P_e \sim \exp(-N E(R))$ [4]. This error exponent $E(R)$, also called reliability function, provides a particularly concise
characterization of performance.

For a given code ensemble, two classes of error exponents can generally be distinguished, due to the presence of
two levels of "disorder", one associated with the choice of the code itself, and a second associated with the realization
of the noise. Average error exponents correspond to averaging the error probability over these two levels
simultaneously, while typical error exponents refer to fixed, typical, codes.

In the present paper, we tackled the computation of these two error exponents for a particular class of block codes,
the low-density parity-check (LDPC) codes, with two particular channels, the binary erasure channel (BEC) and the
binary symmetric channel (BSC). We considered maximum-likelihood decoding, the best conceivable
decoding procedure. We framed the problem in terms of large deviations, and applied a recently proposed extension of
the cavity method designed to probe atypical events in systems defined on random graphs [15]. This method provides
an alternative to the replica method used in [10] to address similar problems, with the advantage of being based
on explicitly formulated probabilistic assumptions. With respect to this earlier contribution, our work offers several
clarifications, notably on the nature of the different phases, and various extensions, notably to the BEC.
With this particular channel, our results are analytical, and, in the high-noise regime, we conjecture them to be exact.
Recent mathematical results on the typical phase diagram [36] foster hope for a confirmation of our results in that
context.

From a statistical physics perspective, error exponents are interesting for the richness of their phase diagram, which
comprises two phase transitions of different natures. These transitions are observed when the level of noise $p$ is varied
at fixed rate $R$ (or, equivalently in the special case of random codes, when the rate $R$ is varied at fixed $p$). Close
to the static threshold, for $p_e < p < p_c$, errors are mostly due to the proliferation of many incorrect codewords
in the vicinity of the received message. We interpreted this feature in terms of the presence of a glassy phase,
and, accordingly, we were able to describe this regime by considering a one-step replica symmetry breaking (1RSB)
approach. Below $p_e$, errors become dominated by the effect of single isolated codewords, which we attributed to a
transition towards a ferromagnetic state, or 1RSB to RS transition. The noise $p_e$ has its counterpart in the "critical
rate" $R_e$ of information theory [4], which marks the point below which only bounds on the reliability function are
known. The replica symmetric (RS) approach we employed to investigate the regime $p < p_e$ also turns out to be only
approximate, except in the limit of infinite connectivity, where we recovered the error exponents of random linear
codes [7]. We also described a second transition occurring at $p_y < p_e$, below which atypical codes come to dominate
the average exponent, causing it to differ from the typical error exponent. As it takes place in the space of graphs,
this is an example of a critical phenomenon whose description is not accessible to the standard cavity method [14], but
only to its extension to large deviations [15] (see also [37] for another example). However, this second transition
should be taken with utmost care, as it relies on an approximate ansatz.

The numerous efforts made in the information theory community to account for the low rate regime $R < R_e$ have
so far resulted only in upper and lower bounds for the reliability function [6]. Maybe not too surprisingly, this is also
the region of the phase diagram where our methods encounter difficulties. Several examples are however now available
which demonstrate that statistical physics methods can provide exact solutions to notoriously difficult mathematical
problems. The solutions thus obtained generally sharpen our comprehension both of the system at hand and of
the techniques themselves, besides often paving the way for rigorous derivations. In the light of some recent such
achievements, extending the present statistical physics approach to reach a thorough understanding of error exponents
seems to us a valuable challenge.
APPENDIX A: A NOTE ON THE EXPONENTIAL SCALING

The thermodynamic approach is based on the assumption that the leading contribution to the probability of error
decays exponentially with $N$. However, as initially shown by Gallager, for ensembles of LDPC codes, the probability of
error decays only polynomially in $N$ to the leading order. In physical terms, this is due to a few codes (whose number
is a polynomial in $N$) which display a second, metastable, ferromagnetic state at a smaller distance from the ground
state (corresponding to the correct codeword) than the numerous configurations forming the paramagnetic state.

To circumvent this spurious effect in the simplest, yet purely theoretical way, Gallager focused on the so-called
"expurgated ensemble", where the half of the codes with smallest minimum distance is disregarded. On this restricted
ensemble, which excludes the codes with multiple ferromagnetic states, the error probability now decays exponentially
in $N$ at the leading order and can be characterized with an average error exponent. Needless to say, this construction
only makes sense as a convenient theoretical way to access good codes.

As the large deviation method automatically overlooks any polynomial contribution, its results actually apply to the
"expurgated ensemble". This is however only true to the extent that the expurgation does not affect the distribution
of graphs in the ensemble (i.e., does not change the distribution of degrees, of loops, etc.). This is presumably the
case, as supported by the construction presented in [38], where an expurgated ensemble much tighter than Gallager's
one is defined by explicitly associating to any random code an expurgated code obtained by modifying only a number
$O(1)$ of small loops.

APPENDIX B: RANDOM LINEAR MODEL 
1. Definition 

A parity-check code is defined by an $M \times N$ matrix $A$ over $\mathbb{Z}_2$ and its codewords are the vectors $x = (x_1, \ldots, x_N)$
satisfying $Ax = 0$. Code ensembles are therefore subsets of the set of all $2^{MN}$ possible matrices. Taking this complete
set (with all possible matrices having the same probability) defines the so-called random linear model (RLM). In contrast
with LDPC codes, since a typical matrix from the RLM is not sparse, the belief propagation algorithm cannot be
used to decode. While of little practical interest due to this absence of an efficient decoding algorithm, the RLM has
however two major theoretical advantages, both originating from its "maximally random" nature: typical codes from
the RLM saturate the Shannon bounds, and error exponents can be derived rigorously. We review here some of the
established results, which we used in the main text as a reference point to compare our non-rigorous results. Error
exponents for the RLM are indeed expected to provide upper bounds for error exponents of LDPC ensembles, which
are reached only in the limit of infinite connectivity $k, \ell \to \infty$ (this limit is similar to that in which $p$-spin models
approach the random energy model when $p \to \infty$ [27]).

2. Weight enumerator function 

We first characterize the geometry of the space of codewords by means of the so-called weight enumerator function.
Given a code $C$ with matrix $A$, this function gives the number $\mathcal{N}_C(d)$ of codewords $x$ at (Hamming) distance $d =
|x| = \sum_i x_i$ from the origin:

$$\mathcal{N}_C(d) = \sum_{x} \delta(|x|, d)\, \delta(Ax, 0), \tag{B1}$$

where the sum is over all strings $x \in \{0,1\}^N$, and $\delta(x, y)$ enforces the constraint $x = y$. The average weight enumerator
function is obtained by averaging over the code ensemble and satisfies

$$\overline{\mathcal{N}}(d) = \mathbb{E}_C[\mathcal{N}_C(d)] = \binom{N}{d} 2^{-M} \asymp e^{N\Sigma(R,\, \delta = d/N)}, \qquad \Sigma(R, \delta) = (R-1)\ln 2 + H(\delta), \tag{B2}$$

where the limit of infinite block-length $N \to \infty$ is taken with $M = N(1-R)$ and $d = N\delta$. The exponent $\Sigma(R, \delta)$
defines the so-called average weight enumerator exponent. A critical distance is the distance $\delta_{GV}(R)$ defined as the
smallest $\delta > 0$ such that $\Sigma(R, \delta) = 0$. Codewords at distance $d = N\delta$ with $\delta > \delta_{GV}(R)$ proliferate exponentially. On
the other hand, the probability of existence of a codeword at distance $d = N\delta$ with $\delta < \delta_{GV}(R)$ is upper-bounded by
$\overline{\mathcal{N}}(d)$, and thus decays exponentially with $N$. Consequently, for any $\epsilon(N)$ such that $\epsilon(N) \to \infty$ (e.g. $\epsilon(N) = \sqrt{N}$),
only an exponentially small fraction of the codes in the ensemble have a minimal non-zero distance $d = N\delta$ smaller
than $N\delta_{GV}(R) - \epsilon(N)$. Excluding these "worst" codes from the RLM defines the expurgated RLM ensemble.
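
The finite-$N$ identity $\mathbb{E}_C[\mathcal{N}_C(d)] = \binom{N}{d}2^{-M}$ (for $d \geq 1$) can be checked by brute force on small instances. The following Python sketch samples uniform $M \times N$ binary matrices, enumerates all $2^N$ strings and histograms the codewords by weight. The function name, the bit-mask representation of the rows and the sample sizes are choices made for this example.

```python
import math
import random

def average_weight_enumerator(n, m, samples=300, seed=0):
    """Monte Carlo estimate of E_C[N_C(d)] for d = 0..n over uniform m x n binary matrices."""
    rng = random.Random(seed)
    weights = [bin(x).count("1") for x in range(1 << n)]
    totals = [0] * (n + 1)
    for _ in range(samples):
        rows = [rng.getrandbits(n) for _ in range(m)]   # each row is a uniform n-bit mask
        for x in range(1 << n):
            # x is a codeword iff every row has an even overlap with x
            if all(bin(r & x).count("1") % 2 == 0 for r in rows):
                totals[weights[x]] += 1
    return [t / samples for t in totals]

estimate = average_weight_enumerator(10, 5)
exact = [math.comb(10, d) / 2 ** 5 for d in range(11)]   # valid for d >= 1
```

The entry at $d = 0$ is always 1, since the all-zero string is a codeword of every linear code; the other entries fluctuate around $\binom{N}{d}2^{-M}$ with a statistical error that shrinks as the number of samples grows.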
3. Average error exponent over the BEC

Due to the group symmetry of the set of codewords, we can assume without loss of generality that the transmitted
codeword is $(0, \ldots, 0)$. For a given realization of the disorder due to a BEC, we denote by $E \subset \{1, \ldots, N\}$ the subset
of erased bits in the received string, and $d$ the number of elements in $E$. If $A$ is the $M \times N$ matrix representing
the code, the sub-matrix $A_E$ induced by $A$ on $E$ defines the decoding-CSP problem: decoding is impossible if and
only if the kernel of $A_E$ is non-zero. When all matrices $A$ are sampled with uniform probabilities as in the RLM, the
sub-matrices $A_E$ are also represented with uniform probability. Given a noise realization $E$ of magnitude $d$, the error
probability is the probability that a random $M \times d$ matrix $A_E$ is non-injective, so that

$$\mathbb{E}_C[P_e] = \sum_{d=0}^{N} \binom{N}{d} p^d (1-p)^{N-d}\, \mathbb{P}(\exists x \neq 0 \text{ such that } A_E x = 0). \tag{B3}$$

When $d > M$, $A_E$ is necessarily non-injective. When $d \leq M$ on the other hand, a straightforward inductive
argument [8] gives

$$\mathbb{P}(\exists x \neq 0 \text{ such that } A_E x = 0) = 1 - \prod_{i=0}^{d-1} \left(1 - 2^{i-M}\right). \tag{B4}$$

Consequently, the exact expression for the average error probability of the RLM reads

$$\mathbb{E}_C[P_e] = \sum_{d=0}^{M} \binom{N}{d} p^d (1-p)^{N-d} \left(1 - \prod_{i=0}^{d-1} \left(1 - 2^{i-M}\right)\right) + \sum_{d=M+1}^{N} \binom{N}{d} p^d (1-p)^{N-d}. \tag{B5}$$

In the $N \to \infty$ limit, this expression can be evaluated by the saddle point method. When $p < (1-R)/(1+R)$, the dominant
contribution comes from the first sum, with

$$\sum_{d=0}^{M} \binom{N}{d} p^d (1-p)^{N-d} \left(1 - \prod_{i=0}^{d-1} \left(1 - 2^{i-M}\right)\right) \asymp e^{-N[(1-R)\ln 2 - \ln(1+p)]}, \tag{B6}$$

and the typical number of errors $d = 2Np/(1+p)$. When $p > (1-R)/(1+R)$ (and $p < 1-R$ to stay below the
capacity), the dominant contribution comes from the second sum, with

$$\sum_{d=M+1}^{N} \binom{N}{d} p^d (1-p)^{N-d} \asymp e^{-N D(1-R\|p)}, \tag{B7}$$

and the typical number of errors $d \simeq N(1-R)$. We thus obtain for the average error exponent of the RLM the
expression given in Eq. (62),

$$E_1(\mathrm{RLM}) = \begin{cases} (1-R)\ln 2 - \ln(1+p) & \text{if } p < \frac{1-R}{1+R}, \\ D(1-R\|p) & \text{if } \frac{1-R}{1+R} < p < 1-R. \end{cases} \tag{B8}$$

In physical terms, the transition between the two regimes can be interpreted as a transition between a ferromagnetic
(RS) phase and a glassy (1RSB) phase. In the large noise regime, $p > (1-R)/(1+R)$, the error is indeed most
probably due to the noise driving the received string into a "glassy phase" of exponentially numerous incorrect
codewords, as reflected by the fact that then $\mathbb{P}(\exists x \neq 0 \text{ such that } A_E x = 0) = 1$. In contrast, in the low noise regime,
$p < (1-R)/(1+R)$, the error is most probably due to the noise driving the received string into a "ferromagnetic
phase" where an isolated incorrect codeword happens to be closer than the correct codeword; this is reflected by
the fact that $\mathbb{P}(\nexists x \neq 0 \text{ such that } A_E x = 0)$ differs from 1 only by an exponentially small term in $N$, as seen from
Eq. (B4).
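
The exact finite-$N$ average error probability of the RLM on the BEC, i.e. the binomial sum over $d$ weighted by $1 - \prod_{i<d}(1 - 2^{i-M})$ for $d \leq M$ and by 1 for $d > M$, can be compared directly with the asymptotic exponent. The Python sketch below evaluates the sum in the log domain to avoid underflow; the function names and the block length used in the check are choices made for this example.

```python
import math

def log_binom_pmf(n, d, p):
    """log of C(n, d) p^d (1-p)^(n-d)."""
    return (math.lgamma(n + 1) - math.lgamma(d + 1) - math.lgamma(n - d + 1)
            + d * math.log(p) + (n - d) * math.log(1.0 - p))

def log_prob_not_injective(m, d):
    """log of 1 - prod_{i<d} (1 - 2^(i-m)) for 1 <= d <= m; equals log(1) = 0 for d > m."""
    if d > m:
        return 0.0
    log_prod = sum(math.log1p(-2.0 ** (i - m)) for i in range(d))
    return math.log(-math.expm1(log_prod))

def finite_n_exponent(n, rate, p):
    """-(1/n) ln of the exact average error probability at block length n."""
    m = round(n * (1.0 - rate))
    # the d = 0 term vanishes, so the sum starts at d = 1
    terms = [log_binom_pmf(n, d, p) + log_prob_not_injective(m, d) for d in range(1, n + 1)]
    top = max(terms)
    return -(top + math.log(sum(math.exp(t - top) for t in terms))) / n

def asymptotic_exponent(rate, p):
    """Piecewise asymptotic exponent, for 0 < p < 1 - R."""
    if p < (1.0 - rate) / (1.0 + rate):
        return (1.0 - rate) * math.log(2.0) - math.log(1.0 + p)
    a = 1.0 - rate
    return a * math.log(a / p) + (1.0 - a) * math.log((1.0 - a) / (1.0 - p))
```

For $R = 1/2$ and $N = 1000$, the finite-size exponent is close to the asymptotic one in both regimes (for instance at $p = 0.2$ and $p = 0.4$), up to corrections that vanish as $N$ grows, and the two asymptotic branches coincide at $p = (1-R)/(1+R) = 1/3$.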



4. Average error exponent over the BSC 



With the binary symmetric channel (BSC), starting again from the transmitted codeword $(0, \ldots, 0)$, the received
string $y$ cannot be decoded if there exists $x \neq 0$ such that $Ax = 0$ and $|x - y| \leq |y|$. Denoting $P_e(y)$ the probability
of this event, the probability of error is

$$\mathbb{E}_C[P_e] = \sum_{d=0}^{N} \binom{N}{d} p^d (1-p)^{N-d}\, P_e(y^{(d)}), \tag{B9}$$

where $y^{(d)}$ is a generic string of weight $d$, e.g. $y_i = 1$ if $i \leq d$, $y_i = 0$ if $i > d$. If $d/N > \delta_{GV}(R)$, $P_e(y^{(d)})$ goes to one in
the infinite block-length limit. Although no published proof is available in the literature, it is reported as proved [7]
that, when $d/N < \delta_{GV}(R)$, $P_e(y^{(d)})$ is asymptotically equivalent to its union bound approximation (see the following
appendix), i.e.,

$$P_e(y^{(d)}) \sim \mathbb{E}_C\Big[\sum_{x \neq 0} \theta\big(d - |x - y^{(d)}|\big)\, \delta(Ax, 0)\Big] \tag{B10}$$

$$= \sum_{i=0}^{d} \mathbb{E}_C\big[\mathcal{N}_C(i, y^{(d)})\big] \tag{B11}$$

$$\asymp \mathbb{E}_C\big[\mathcal{N}_C(d, y^{(d)})\big], \tag{B12}$$

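where $\mathcal{N}_C(i, y^{(d)})$ is the number of codewords at distance $i$ from $y^{(d)}$, and $\theta(x) = 1$ if $x \geq 0$, and $0$ otherwise.
Straightforward combinatorics shows that the asymptotic behavior of $\mathbb{E}_C[\mathcal{N}_C(i, y^{(d)})]$ is given by the standard weight
enumerator exponent $\Sigma(R, i/N)$. In the limit $N \to \infty$ where $\delta = d/N$ is kept fixed, a saddle-point evaluation leads
to the following expression of the average error exponent:

$$E_1(\mathrm{RLM}) = -\max_{\delta \leq \delta_{GV}} \left[\Sigma(R, \delta) - D(\delta\|p)\right] \tag{B13}$$

$$= \begin{cases} (1-R)\ln 2 - \ln\left[1 + 2\sqrt{p(1-p)}\right] & \text{if } \frac{\sqrt{p}}{\sqrt{p} + \sqrt{1-p}} < \delta_{GV}(R), \\ D(\delta_{GV}(R)\|p) & \text{otherwise.} \end{cases} \tag{B14}$$

This result, with two distinct regimes, is very similar to that obtained previously for the BEC.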
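
The saddle-point step can be checked numerically by maximizing $\Sigma(R,\delta) - D(\delta\|p)$ over $0 < \delta \leq \delta_{GV}(R)$ on a grid and comparing with the closed form. The Python sketch below does this with a plain grid search; the grid size and function names are arbitrary choices, and the sign convention assumed is $E_1 = -\max_\delta[\Sigma - D]$.

```python
import math

def h_nats(x):
    """Binary entropy in nats."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def kl(a, b):
    """Binary Kullback-Leibler divergence D(a||b) in nats."""
    return a * math.log(a / b) + (1.0 - a) * math.log((1.0 - a) / (1.0 - b))

def sigma_rlm(rate, delta):
    """Average weight enumerator exponent of the RLM."""
    return (rate - 1.0) * math.log(2.0) + h_nats(delta)

def delta_gv(rate, iters=200):
    lo, hi = 1e-12, 0.5          # sigma_rlm is negative at lo and positive at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sigma_rlm(rate, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponent_by_maximization(rate, p, grid=20000):
    d_gv = delta_gv(rate)
    deltas = [d_gv * (i + 1) / grid for i in range(grid)]   # last point is delta_GV itself
    return -max(sigma_rlm(rate, d) - kl(d, p) for d in deltas)

def exponent_closed_form(rate, p):
    d_gv = delta_gv(rate)
    d_star = math.sqrt(p) / (math.sqrt(p) + math.sqrt(1.0 - p))   # interior stationary point
    if d_star < d_gv:
        return (1.0 - rate) * math.log(2.0) - math.log(1.0 + 2.0 * math.sqrt(p * (1.0 - p)))
    return kl(d_gv, p)
```

The interior stationary point is $\delta^* = \sqrt{p}/(\sqrt{p} + \sqrt{1-p})$: when it lies below $\delta_{GV}(R)$ the maximum is reached there, otherwise it is reached on the boundary $\delta_{GV}(R)$, which is the origin of the two regimes.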



APPENDIX C: UNION BOUNDS 



The so-called union bound exponent is a rigorous lower bound of the average error exponent in the expurgated 
ensemble. We show in this appendix how the average weight enumerator exponent of (regular) LDPC codes can be 
used to derive this union bound exponent, for both the BEC and the BSC. We will thus recover results first established 
by Gallager in [4, 39]. In a nutshell, the idea of the union-bound is to upper-bound the probability that at least one 
(bad) codeword causes an error by the sum of the probabilities that each does. Remarkably, this union bound turns 
out to be tight for the RLM ensemble. 



1. Weight enumerator function 



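The weight enumerator function (see Eq. (B1) for the definition) of regular LDPC codes with $k = 6$ and $\ell = 3$ was
computed in [4] and reads:

$$\mathbb{E}_C[\mathcal{N}_C(d)] = \sum_{x} \delta(|x|, d)\, \mathbb{E}_C[\delta(Ax, 0)] \tag{C1}$$

$$= \binom{N}{d}\, \mathbb{E}_C\big[\delta(Ax^{(d)}, 0)\big], \tag{C2}$$

$$\mathbb{E}_C[\mathcal{N}_C(d = \delta N)] \asymp e^{N\Sigma(k, \ell, \delta)}, \quad \text{with } \Sigma(k, \ell, \delta) = \min_{\mu}\left(2\mu\ell\delta + (1-\ell)H(\delta) + \frac{\ell}{k}\ln C(\mu)\right), \tag{C3}$$

$$\text{and } C(\mu) = \frac{\left(1 + e^{-2\mu}\right)^k + \left(1 - e^{-2\mu}\right)^k}{2}, \tag{C4}$$

where $x^{(d)}$ denotes a generic string of weight $d$. We introduce $\delta_m$, the smallest $\delta > 0$ such that $\Sigma(k, \ell, \delta) \geq 0$. By construction, the average enumerator exponent in the
expurgated ensemble is

$$\Sigma_{\mathrm{exp}}(k, \ell, \delta) = \begin{cases} \Sigma(k, \ell, \delta) & \text{if } \Sigma(k, \ell, \delta) \geq 0 \text{ (i.e. if } \delta \geq \delta_m), \\ -\infty & \text{otherwise.} \end{cases} \tag{C5}$$

This expurgated average enumerator exponent $\Sigma_{\mathrm{exp}}(k, \ell, \delta)$ is believed to coincide with the typical enumerator exponent
[40, 41].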
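
The exponent $\Sigma(k,\ell,\delta)$ and the threshold $\delta_m$ are easy to evaluate numerically. The Python sketch below assumes the form $\Sigma(k,\ell,\delta) = \min_\mu[2\mu\ell\delta + (1-\ell)H(\delta) + (\ell/k)\ln C(\mu)]$ with $C(\mu) = [(1+e^{-2\mu})^k + (1-e^{-2\mu})^k]/2$ and entropies in nats. It uses a ternary search over $\mu$ (the objective is convex in $\mu$) followed by a bisection in $\delta$; the search windows and function names are assumptions of this example.

```python
import math

def h_nats(x):
    """Binary entropy in nats."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def sigma_ldpc(k, ell, delta, iters=200):
    """Average weight enumerator exponent of the regular (k, ell) ensemble."""
    def objective(mu):
        x = math.exp(-2.0 * mu)
        c = 0.5 * ((1.0 + x) ** k + (1.0 - x) ** k)
        return 2.0 * mu * ell * delta + (ell / k) * math.log(c)
    lo, hi = -2.0, 20.0              # assumed search window for mu
    for _ in range(iters):           # ternary search on a convex function
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) < objective(b):
            hi = b
        else:
            lo = a
    return (1.0 - ell) * h_nats(delta) + objective(0.5 * (lo + hi))

def delta_min(k, ell, lo=1e-4, hi=0.2, iters=60):
    """Smallest positive delta at which sigma_ldpc changes sign (assumes one crossing in [lo, hi])."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sigma_ldpc(k, ell, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $k = 6$, $\ell = 3$ this gives $\delta_m$ close to 0.023, and $\Sigma(6, 3, 1/2) = \frac{1}{2}\ln 2$, i.e. $R\ln 2$ with $R = 1 - \ell/k$, as expected since codewords of weight $N/2$ dominate the total count $2^{NR}$.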



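2. Union bound for the BEC

Given the set $E$ of erased bits, we want to estimate the probability $P_e(d)$ that the CSP-decoding problem has at
least two solutions, when a code $C$ is drawn at random from its ensemble. We call $A$ the matrix characterizing $C$,
$A_E$ the sub-matrix induced by $A$ on $E$, and $d$ the number of erased bits. The union bound consists in the following
inequality:

$$P_e(d) = \mathbb{P}\big(\exists \tilde{x} \in \{0, 1\}^d,\ \tilde{x} \neq 0, \text{ such that } A_E \tilde{x} = 0\big) \tag{C6}$$

$$\leq \min\Big[\sum_{\tilde{x} \neq 0} \mathbb{P}(A_E \tilde{x} = 0),\ 1\Big]. \tag{C7}$$

Let $w = |\tilde{x}|$ and $x$ be constructed from $\tilde{x}$ by setting $x_i = \tilde{x}_i$ for $i \in E$, $x_i = 0$ otherwise; $\tilde{x}$ belongs to the kernel of $A_E$
if and only if $x$ belongs to the kernel of $A$. The probability of the latter event reads

$$\frac{\mathbb{E}_C[\mathcal{N}_C(w)]}{\binom{N}{w}}. \tag{C8}$$

The error probability is consequently bounded by:

$$\mathbb{E}_C[P_e] = \sum_{d=0}^{N} \binom{N}{d} p^d (1-p)^{N-d}\, P_e(d) \tag{C9}$$

$$\leq \sum_{d=0}^{N} \binom{N}{d} p^d (1-p)^{N-d} \min\Big[\sum_{w=1}^{d} \binom{d}{w} \frac{\mathbb{E}_C[\mathcal{N}_C(w)]}{\binom{N}{w}},\ 1\Big]. \tag{C10}$$

In the infinite block-length limit, a saddle-point estimate yields, as lower bound for the expurgated average error
exponent, the exponent

$$E_{\mathrm{exp}}(k, \ell) \geq E_{UB} = -\max_{\delta}\left\{-D(\delta\|p) + \min\left[\max_{\omega}\left(\Sigma(\omega) + \delta H\!\left(\frac{\omega}{\delta}\right) - H(\omega)\right),\ 0\right]\right\}$$

$$= -\max_{\delta \leq \delta_{UB}}\left\{-D(\delta\|p) + \max_{\omega \geq \delta_m} \min_{\mu}\left[\delta H\!\left(\frac{\omega}{\delta}\right) + 2\mu\ell\omega - \ell H(\omega) + \frac{\ell}{k}\ln C(\mu)\right]\right\}, \tag{C11}$$

where $\delta = d/N$, $\omega = w/N$, and $\delta_{UB}$ is the largest $\delta$ such that $\max_{\omega}\left(\Sigma(\omega) + \delta H(\omega/\delta) - H(\omega)\right)$ is non-positive.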

As $p$ is varied, three regimes can be distinguished. For small $p$, the maximum over $\omega$ is reached on the boundary
$\delta_m$, meaning that errors are dominated by the nearest codewords. For large $p$ instead, the maximum over $\delta$ is reached
at $\delta_{UB}$, in which case the union bound is simply replaced by 1, physically corresponding to a large number of bad
codewords arising from the large amplitude of the noise. Finally, in the intermediate region of $p$, the extremum is
reached in the interior of the $(\delta, \omega)$ domain. Note that this last regime is not always present when $k$ and $\ell$ are too
small (for $k = 6$ and $\ell = 3$ in particular). These three regimes are given in the limit $k, \ell \to \infty$ by:

$$E_0(\mathrm{RLM}) = \begin{cases} -\delta_{GV}(R)\ln p & \text{if } p < p_y, \\ (1-R)\ln 2 - \ln(1+p) & \text{if } p_y < p < \frac{1-R}{1+R}, \\ D(1-R\|p) & \text{if } \frac{1-R}{1+R} < p < 1-R, \end{cases} \tag{C12}$$

with $p_y$ defined as in (69). Union bounds for the BEC are plotted in Fig. 12 for several regular ensembles.



3. Union bound for the BSC 

FIG. 12: Expurgated union bounds for the BEC (left) and the BSC (right). From bottom to top, $(k, \ell) = (6, 3), (8, 4), (12, 6)$
and the RLM limit, expurgated (top full curve) and not expurgated (bottom full curve) with $R = 1/2$. The points indicate the
transition between the three regimes, as well as $e_{UB}$.

The union bound for the BSC is derived following the same steps as for the BEC. The counterpart of Eq. (C6)
reads

$$P_e(d) = \mathbb{P}\big(\exists x \neq 0 \text{ such that } |x - y^{(d)}| \leq d \text{ and } Ax = 0\big), \tag{C13}$$

where $y^{(d)}$ is a generic string of weight $d$. Let $x$ be a string of weight $w$ and $Q(w, d, g)$ be the probability for $y^{(d)}$ to
be at distance $g$ from $x$, conditioned on $|y^{(d)}| = d$:

$$Q(w, d, g) = \binom{w}{(d - g + w)/2} \binom{N - w}{(d + g - w)/2} \binom{N}{d}^{-1}. \tag{C14}$$

The probability for $y^{(d)}$ to be at distance $g$ from any codeword $x$ is upper-bounded by

$$\sum_{w} \mathbb{E}_C[\mathcal{N}_C(w)]\, Q(w, d, g), \tag{C15}$$

and we can write

$$P_e(d) \leq \min\Big[\sum_{g \leq d} \sum_{w} \mathbb{E}_C[\mathcal{N}_C(w)]\, Q(w, d, g),\ 1\Big] \asymp \min\Big[\sum_{w} \mathbb{E}_C[\mathcal{N}_C(w)]\, Q(w, d, d),\ 1\Big]. \tag{C16}$$
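
The combinatorial factor $Q(w, d, g)$ of Eq. (C14) counts the strings of weight $d$ that share $(d - g + w)/2$ ones with a fixed string of weight $w$. The Python sketch below evaluates it and compares it with an exhaustive enumeration on a small example; the function names and the sizes used are arbitrary.

```python
import math
from itertools import combinations

def q_prob(n, w, d, g):
    """Probability that a uniform weight-d string is at distance g from a fixed weight-w string."""
    if (d - g + w) % 2 != 0:
        return 0.0
    overlap = (d - g + w) // 2      # positions where both strings carry a 1
    outside = (d + g - w) // 2      # 1s of y located outside the support of x
    if overlap < 0 or outside < 0 or overlap > w or outside > n - w:
        return 0.0
    return math.comb(w, overlap) * math.comb(n - w, outside) / math.comb(n, d)

def q_brute_force(n, w, d, g):
    """Same probability, by enumerating every weight-d string of length n."""
    x = set(range(w))               # fixed string with 1s on the first w positions
    hits = total = 0
    for support in combinations(range(n), d):
        total += 1
        hits += (len(x.symmetric_difference(support)) == g)
    return hits / total
```

Summing `q_prob(n, w, d, g)` over all distances $g$ gives 1, as it must for a probability distribution over $g$ at fixed $w$ and $d$.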



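From this inequality and Eq. (C9), we obtain the union bound for the error exponent via the saddle-point method:

$$E_{\mathrm{exp}}(k, \ell) \geq E_{UB} = -\max_{\delta}\left\{-D(\delta\|p) + \min\left[\max_{\omega}\left(\Sigma(\omega) + L(\omega, \delta, \delta)\right),\ 0\right]\right\}$$

$$= -\max_{\delta \leq \delta_{UB}}\left\{-D(\delta\|p) + \max_{\omega \geq \delta_m} \min_{\mu}\left[2\mu\ell\omega + (1-\ell)H(\omega) + \frac{\ell}{k}\ln C(\mu) + L(\omega, \delta, \delta)\right]\right\}, \tag{C17}$$

with

$$L(\omega, \delta, \gamma) = \omega H\!\left(\frac{\delta - \gamma + \omega}{2\omega}\right) + (1-\omega) H\!\left(\frac{\delta + \gamma - \omega}{2(1-\omega)}\right) - H(\delta).$$

As for the BEC, three regimes can be distinguished, according to the value of $p$. In the limit $k, \ell \to \infty$, these three
regimes are:

$$E_0(\mathrm{RLM}) = \begin{cases} -\delta_{GV}(R)\ln\left[2\sqrt{p(1-p)}\right] & \text{if } p < p_y, \\ (1-R)\ln 2 - \ln\left[1 + 2\sqrt{p(1-p)}\right] & \text{if } p_y < p < p_e, \\ D(\delta_{GV}(R)\|p) & \text{if } p_e < p < \delta_{GV}(R), \end{cases} \tag{C18}$$

where $p_y$ and $p_e$ are given by (91) and (85).

Union bounds for the BSC are plotted in Fig. 12.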



APPENDIX D: IRREGULAR CODES 



1. Definition of the ensemble

In this appendix we discuss the generalization to irregular graphs. We shall only treat the entropic large deviations
with the BEC, but our arguments can easily be generalized to the other cases. With irregular codes, it is necessary
to specify more precisely the definition of the ensemble. The usual definition is via the degree distributions $v_\ell$ and $c_k$.
It is however possible to define different ensembles having the same distribution and sharing the same typical properties,
but differing at the level of atypical properties, including error exponents (see also [15] for similar non-equivalences
in another context).

The simplest construction takes all factor graphs with exactly $v_\ell N$ variables of degree $\ell$ and $c_k M$ checks of degree $k$,
and picks them with uniform probability. Such ensembles are used to build actual codes, and we shall therefore analyze
them in some detail.



2. Average error exponent 

We revisit the arguments of section III B and emphasize the differences with the regular case.
A crucial modification is the introduction of Lagrange multipliers enforcing the number of nodes of each degree. Call
$N_\ell$ the number of variables of degree $\ell$, and $M_k$ the number of checks of degree $k$. Denote $n_\ell = N_\ell/N$, $m_k = M_k/N$.
The rate $L_1$ is now a function of the $n_\ell$ and $m_k$. Its multiple Legendre transform is defined as:

$$\phi(x, \{\lambda_\ell\}, \{\nu_k\}) = xs + \sum_{\ell} \lambda_\ell n_\ell + \sum_{k} \nu_k m_k - L_1, \tag{D1}$$

with $x = \partial_s L_1$, $\lambda_\ell = \partial_{n_\ell} L_1$, $\nu_k = \partial_{m_k} L_1$.

Let us consider the addition of a new bit. $\ell$ checks are added along with it, where $\ell$ is drawn with probability $v_\ell$.
Each of these checks, in turn, is connected to $k_a - 1$ old bits ($a = 1, \ldots, \ell$), where $k_a$ is drawn with probability
$k_a c_{k_a}/\langle k \rangle$. Eq. (31) is modified in the following way:



i {ki,...,ke} a=l ^ ' •' 

Zk = -\\nj dASP^\AS) e 



exp 



cAS" + ^{{ka - l)zk^ + i/fcj + 



(D2) 

The addition of a variable of degree $\ell$ is reflected by a factor $e^{\lambda_\ell}$, and the addition of a check of degree $k$ by a factor
$e^{\nu_k}$. Call $k$-degree the degree of a variable with respect to checks of degree $k$. Here $z_k$ is related to the increase of
$k$-degrees in the ensemble. Let us consider for a moment a more general setting, where the ensemble is determined
by the $k$-degree distributions, denoted by $v_\ell^{(k)}$ [42]. Then $z_k$ is defined by



'^^^ — ^) — (°3) 

where Sv^/''^ = I'^'l^ — ^l*^''- ^k is obtained in a very similar way as z in (37): 



zk = -\hij dASP^\AS) e^^^, (D4) 



where P^\AS) now depends on the degree fc. 

The cavity equation (24) is modified in a very similar way as the expression of 0i in (D2). The inversion of the 
Legendre transformation allows to recover the relevant quantities: 

s = 8x4) Ui = dx^cj) mk = d^^(j) (D5) 
Replacing P^^_^^--'''\AS) and P^\AS) by their values, we obtain: 



xs-Li^ In [v{A) + J5(2^ - l)w(S)] 



with A = c^'J2 ^e(''^-i)^'=+''^ [2-^ + (1 - 2-")(l - z/)*^] , B = 2^"e^^ ^ ^j,(fe-i)..+.. (1 - (1 - J^)'=-'), 



.fc = 4ln[2-^ + (l-2-^)(l-.)^^ ''^"'^''^ 



fc L 'V ' i k' v'{A) +p{2^ -l)v'{B)' 



(D6) 
FIG. 13: Average error exponent of a given code as a function of the noise level $p$ for irregular codes with $c_k = (1/2)(\delta_{k,6} + \delta_{k,8})$
and $v_\ell = (1/2)(\delta_{\ell,3} + \delta_{\ell,4})$ through the BEC.

To evaluate $L_1$ as a function of $s$, we simply need to tune the parameters $\lambda_\ell$ and $\nu_k$ such that the conditions
$n_\ell = v_\ell$ and $m_k = \alpha c_k$ are satisfied.

In Fig. 13, we represent the error exponent for the irregular ensemble with $v(x) = (1/2)x^3 + (1/2)x^4$, $c(x) =
(1/2)x^6 + (1/2)x^8$.

APPENDIX E: CALCULATIONS IN THE BSC 
1. Belief Propagation and the Bethe approximation 

In this section we write down the BP equations for a given code over the BSC, or equivalently the cavity equations
at the RS level. The expression of the free energy is also given.
The cavity equations read:

(El) 

Tb-i j&b—i 

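where $p_{i \to a}(\tau_i)$ is the probability that the variable $i$ takes the value $\tau_i$ in the absence of $a$, and $q_{b \to i}(\tau_i)$ is proportional to the
probability that the variable $i$ takes the value $\tau_i$ when connected to $b$ only.

Denoting $p_{i \to a}(\tau_i) = e^{\beta h_{i \to a}\tau_i}/(2\cosh\beta h_{i \to a})$ and $q_{b \to i}(\tau_i) = e^{\beta u_{b \to i}\tau_i}/(2\cosh\beta u_{b \to i})$, the cavity equations simplify to:

$$h_{i \to a} = h_i + \sum_{b \in i - a} u_{b \to i}, \qquad u_{b \to i} = \hat{u}(\{h_{j \to b}\}) = \frac{1}{\beta}\,\mathrm{atanh}\Big[\prod_{j \in b - i} \tanh(\beta h_{j \to b})\Big]. \tag{E2}$$

The local magnetization is given by $\langle \sigma_i \rangle = \tanh\beta H_i$, with $H_i = h_i + \sum_{a \in i} u_{a \to i}$. The Bethe approximation to the
free energy reads:

$$F_{RS}(\beta) = \sum_{i} \Delta F_i - \sum_{a} (k_a - 1)\, \Delta F_a, \tag{E3}$$

with

$$\Delta F_i = \Delta F_{i + a \in i}(\{u_{a \to i}\}) = \frac{1}{\beta}\sum_{a \in i} \ln\left[2\cosh(\beta u_{a \to i})\right] - \frac{1}{\beta}\ln\Big[2\cosh\Big(\beta h_i + \beta\sum_{a \in i} u_{a \to i}\Big)\Big],$$

$$\Delta F_a = \Delta F_a(\{h_{j \to a}\}) = -\frac{1}{\beta}\ln\Big[\frac{1 + \prod_{j \in a} \tanh(\beta h_{j \to a})}{2}\Big]. \tag{E4}$$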
Define:

$$P(h) = \frac{1}{N\langle \ell \rangle}\sum_{(i,a)} \delta(h - h_{i \to a}), \qquad Q(u) = \frac{1}{N\langle \ell \rangle}\sum_{(i,a)} \delta(u - u_{a \to i}).$$

Averaging (E1) over the codes, the noise and the edges, we obtain the self-consistency equations



a=l 
k~l 



A: ^ ' '' 1=1 

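where $h^0 = h_0$ with probability $1 - p$ and $-h_0$ with probability $p$. The RS free energy reads:

$$f_{RS}(\beta) = \sum_{\ell} v_\ell \int \prod_{a=1}^{\ell} du_a\, Q(u_a)\, \big\langle \Delta F_{0 + a \in 0}(h^0, \{u_a\}) \big\rangle_{h^0} - \frac{\langle \ell \rangle}{\langle k \rangle}\sum_{k} c_k (k-1) \int \prod_{i=1}^{k} dh_i\, P(h_i)\, \Delta F_a(\{h_i\})$$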



(E5) 

(E6) 
(E7) 

(E8) 



2. Large Deviations 



As in the BEC, we study the statistics of BP over the codes, under the measure $\propto \exp[-x_f\beta_f F_{\mathrm{corr}}(\beta_f) - x_e\beta_e F_{RS}(\beta_e)]$.
The large deviation cavity equations read, for a regular code:



P{K) CX / J] dWa g( 



S(h-h^- Y!aJl ^a) C"'^'""^ h cosh [Peih + Ea=l 



nti[2cOsh(/3e^Xa)]" = 



Q{u) ^ ( W d/ii P{hi)5 u - iatanh [ ^ tanh(/3pft,i) 
And the potential: 

. i (e'3W'e[2cosh(/3e(/ic +ELi^a) 

Ua) 



(fc- l)ln f Y[dh,P{h. 

i=i 



nLl[2cOsh(/3e«a)]" = 

l + ntitanh(/3e/i,)' 



The solution to (E9) is obtained numerically. In the limit $k, \ell \to \infty$, this solution simplifies:

$$Q(u) = \delta(u), \qquad P(h) = (1-p)\,\delta(h - h_0) + p\,\delta(h + h_0),$$

yielding the error exponent (83).

Another solution, called "type I" in [10], also exists:

$$Q(u) = \eta\,\delta_{+\infty}(u) + (1-\eta)\,\delta_{-\infty}(u), \qquad P(h) = \nu\,\delta_{+\infty}(h) + (1-\nu)\,\delta_{-\infty}(h),$$

with



(E9) 



(ElO) 



(Ell) 



(E12) 



+ (1 -r/)^-i (e-2y?^o-T)^' 

We automatically have $s_e = 0$, and the condition $f_e = f_f$ implies $m = \beta_e x_e = 1/2$. Then the rate function reads:



(E13) 



Liifp = ff) = -4> = - In W + (1 - i-iY (e-'""^) J - -(fc - 1) In 



-(1 + (2^.-1)^) 



(E14) 



This solution (E12) is numerically unstable and the rate function thus obtained is clearly unphysical. However, for
$k, \ell \to \infty$, $\ell/k = 1 - R$, we have $\eta = \nu = 1/2$ and the resulting rate function

$$L_1(f_e = f_f) = -\ln\left[\frac{1}{2}\left(1 + 2\sqrt{p(1-p)}\right)\right] - R\ln 2 = \ln 2\,\left(R_0(p) - R\right) \tag{E15}$$

coincides with the error exponent of the RLM in the low $p$ regime (B14).






3. Two-step large deviations 

The potential $\psi(\beta_e, m, y)$ defined in (87) is obtained by extremizing the following expression with respect to $P(h)$
and $Q(u)$:



2 cosh {jie{h^ + Yfa^l ""a) 



m//3e\ ^ y 
I he 



|(fc-l)ln j\[dh,P{h,) 



n:^l [2cOSh(/3eWa)] 

l + ntitanh(/3e/iO^ 



(E16) 



We can only handle this calculation in the $k, \ell \to \infty$ limit. (E11) is still a solution in this case, and yields:

$$\psi(\beta_e, m, y) = y\,\phi(\beta_e, m), \tag{E17}$$

where $\phi(\beta_e, m)$ is obtained from the average case. Therefore, the typical exponent is the same as the average error
exponent in the high $p$ regime.

There also exists a counterpart of solution (E12), which gives:

$$\psi(\beta_e, m, y) = (R-1)\ln 2 + \ln\left[1 + \left((1-p)^{1-m}p^{m} + p^{1-m}(1-p)^{m}\right)^y\right]. \tag{E18}$$

The condition $\partial_m \psi = 0$ is again enforced by setting $m = 1/2$. Thus we get:

$$\psi(y) = -yL - \mathcal{L} = (R-1)\ln 2 + \ln\left[1 + \left(2\sqrt{p(1-p)}\right)^y\right]. \tag{E19}$$

This expression yields the rate function $\mathcal{L}(L)$ by inverse Legendre transformation.